CONCEPT FORMATION OF WORK-STUDY
SKILLS BY USE OF AUTOBIOGRAPHIES 
IN GRADE FOUR 


WALLACE J. HOWELL 


Principal, George M. Diven School 
Elmira, New York 


By definition, “a concept is an idea that includes all that is
characteristically associated with, or suggested by, a term.” A 
term, to become meaningful and applicable, must become a part of 
the child’s sensory impressions and his thinking. If this is to be 
accomplished in the many areas of education, all terms must be 
presented so that they are understood, and suitably integrated 
into the learning experiences of the child. In the study here re- 
ported a program of work-study skills was decided on to provide 
the learning experiences, but the desirability of testing its effec- 
tiveness before it was put into operation was recognized. The 
improvement in performance could be ascertained by the use of 
an achievement test, but it was also desired to learn the extent 
to which the experiences would become internalized by the pupils. 
On the principle that individuals tend to report spontaneously on 
events involving a cognitive reorganization, a situation was pro- 
vided for those who had the experiences to make such a report. 
Others who were familiar with all the activities involved but with- 
out the experiences with the work-study skills would be given a 
similar opportunity. The technique of measurement chosen was 
the pupil autobiography. The hypothesis, stated positively, was
that if the experiences were internalized, the pupils who had had
them would reveal the fact through the number of items relating
to the experiences appearing spontaneously in their autobiographies.
An initial autobiography, an intervening period of training,
and an added ‘chapter’ called for at the conclusion of the experiment
should show significant differences. The items mentioned by
fourth-grade children in their spontaneous autobiographies should
also shed light on some of the problems hindering their adjustment
and personality development and reveal differences between
boys and girls in this respect.

If the results were statistically significant, a trend would be 
shown which might aid in reducing the problems involved in teach- 
ing work-study skills. This becomes a factor of great importance, 
and the manner in which these skills are presented depends upon 
the techniques of informing, instructing, and teaching the child, 
as well as the manner in which the child gives back the information. 
This is based upon the assumption that the skill has reached the 
realm of conceptual reality as both the teacher’s purpose and the 
child’s purpose are consummated. 


PROCEDURE 


The research design for this study involved the use of the 
matched-group technique. An experimental group consisting of 
eighteen boys and twenty-five girls and, in a different school, a 
control group of the same number of boys and girls were formed by 
the matching process. This made a total of eighty-six children— 
thirty-six boys and fifty girls. These groups were matched on the 
basis of the following criteria: Age in months; intelligence quotients 
derived from the New California Short Form Test of Mental Ma- 
turity, Elementary ’47, S-Form; the raw reading scores obtained 
from the Iowa Every-Pupil Test of Basic Skills, Test A, Form M; 
and sex. The method described by Peters and Van Voorhis (2)
was used with these criteria in the matching process. A comparison
of the mean and the standard deviation in each of the three criterion
areas in the experimental and control groups, as given below,
reveals the accuracy of this matching process.

                      Experimental Group      Control Group
Criteria                 M        SD            M        SD
Age in months           116      7.05          115      9.78
Raw reading score        60     16.9            57     17.5
IQ                      103     16.0           105     13.0

The study was begun in September, 1949, and extended for nine
months, ending in May, 1950.

In September, 1949, after the testing program and the equating
process were completed, each child was asked to write his
autobiography without using a guide or outline. Each autobiography
was carefully read and all items mentioned by the children were 
tabulated. These items were categorized into ten major groups 
with the frequency of mention as follows: Favorite subjects (123); 
Liking for teacher (53); Parts of school building (30); Schools 
attended (25); Best liked books (17); Liking for schoolmates (17); 
Test items (14); Difficult subjects (9); Liking for flag (4); and 
Mother’s criticism (1). In all, the eighty-six children mentioned
a total of 293 items in the initial autobiographies, or an average
of 3.4 items per child. The eighteen boys of the experimental group
mentioned 71*¹ items, or twenty-four per cent of the total, while
the twenty-five girls of the same group mentioned 80* items, or
twenty-seven per cent of the total. In the control group the eighteen
boys listed 50* items, or seventeen per cent, while the girls named
92* items, or thirty-one per cent of the total. Percentagewise, very
little difference appeared between the equated groups. Taking both
groups together, the thirty-six boys mentioned 121* items, or
forty-one per cent of the total, while the fifty girls mentioned
172* items, or fifty-nine per cent of the total.
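
The percentages reported for the initial autobiographies follow directly from the starred frequencies. A minimal sketch of that arithmetic is given here, assuming Python; the dictionary layout and the function name `percent_shares` are illustrative and are not part of the study.

```python
# Starred frequencies of mention from the initial autobiographies.
initial = {
    ("experimental", "boys"): 71,
    ("experimental", "girls"): 80,
    ("control", "boys"): 50,
    ("control", "girls"): 92,
}


def percent_shares(counts):
    """Return each count as a whole-number percentage of the grand total."""
    total = sum(counts.values())
    return {key: round(100 * n / total) for key, n in counts.items()}


shares = percent_shares(initial)

# Totals by sex across both groups.
boys = sum(n for (_, sex), n in initial.items() if sex == "boys")
girls = sum(n for (_, sex), n in initial.items() if sex == "girls")

for key, pct in shares.items():
    print(key, initial[key], f"{pct} per cent")
print("boys:", boys, "girls:", girls, "all:", boys + girls)
```

The same function can be applied to any row of frequencies in the study, since each percentage is simply a share of the row total.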

Besides the autobiographies the eighty-six children involved in 
the study were given the Iowa Every-Pupil Test of Basic Skills, 
Test B, Form M, in the fall and Form N again at the conclusion 
of the study to further measure the difference in achievement as a 
result of the year’s work. 

An intensive program of instruction was begun in the area of 
work-study skills in September, 1949. Twenty-three units of work 
were presented to the experimental group while their control coun- 
terparts were not exposed to this intense program. Each unit was 
carefully and thoroughly presented to the experimental group by 
the librarian. The teacher of the experimental group then utilized 
the content of the units and correlated it with all the subjects of 
the school curriculum. A careful record was kept of the number of
times each unit was correlated with the curriculum. The twenty-three
units included such topics as parts of a book; index; encyclopedia,
both general and special; dictionary; atlas; World Almanac;
features common to all reference books; outlining, including general
technique as well as outlining from visual and auditory perception;
and reading with emphasis upon various types applicable to different
purposes of the reader, such as comprehension and skimming.

¹ The asterisks here indicate frequencies of mention for which
reliabilities were found.

In May, 1950, after nine months of intensive work, the initial
autobiographies were returned to the children with the simple
request to add a chapter to the original. Again, the added chapters
were carefully examined and the various items tabulated. Table I
categorizes the children’s responses according to the same pattern
followed in the initial autobiography.


TABLE I. NUMBER AND PER CENT OF ITEMS MENTIONED IN THE ADDED
CHAPTERS OF THE AUTOBIOGRAPHIES BY EIGHTY-SIX CHILDREN IN
GRADE IV IN THE EXPERIMENTAL AND CONTROL GROUPS BY SEXES,
SPRING 1950

Categories of Items             Experimental Group     Control Group          Total
Mentioned                       18 Boys    25 Girls    18 Boys    25 Girls    36 Boys    50 Girls
                                  N    %     N    %      N    %     N    %      N    %     N    %

Work study skills                74   35   114   54      6    3    16    8     80   38   130   62
Likes teacher                     7   17    12   29      7   17    15   37     14   34    27   66
Favorite subjects                 7   10    11   15     32   44    23   31     39   53    34   47
Countries and regions studied     8   42     6   32      1    5     4   21      9   47    10   53
Difficult subjects                3   43     4   57                             3   43     4   57
Parts of building                 1   20     2   40      1   20     1   20      2   40     3   60
Secondary items                   5   21     4   17      6   25     9   37     11   46    13   54
Total                           105*  28   153*  40     53*  14    68*  18    158*  42   221*  58

* Indicates frequencies of mention for which reliabilities were found.

Because the major hypothesis concerns the formation of concepts of
work-study skills, the terms in the first category (work study
skills) given by the experimental group in Table I are listed here
in descending order, with the frequency of mention after each
term. These terms are: Dictionary (22); Encyclopedia
(21); Trip to city library (19); Likes school library (12); How to
study (11); World Almanac (11); Atlas (10); Maps (9); Index (5);
Card catalog (5); Globe (4); Movie on how to study (4); and 
Learned more than ever before (3). The remaining fifty-two items 
mentioned could be considered adjustive in nature because such 
items as likes school and happy in grade, lots of fun, likes work of 
grade, exciting year, enjoyed extra hard work, likes classmates, 
year passed by fast, etc., indicate that the social climate of the 
classroom brought about by the year’s work impressed the children 
considerably. 

The effect of this intensive work during the nine months is 
revealed by the fact that, unsolicited, these children in the experi- 
mental situation listed twenty-seven different terms one hundred 
eighty-eight times in the category of work-study skills, or seventy- 
three per cent of all items mentioned in the added chapter by this 
group. This indicates that these items became a part
of the child’s thinking, since he was able to recall and use these
terms. The children in this group averaged 6.0 items per pupil on
the added chapter.

In contrast, the control group mentioned twenty-two items
definitely classified in the work-study skills category. These were
eighteen per cent of all items mentioned in the added chapter by
that group. The average
number of items mentioned was 2.8. Aside from the work-study
skills items, all other items mentioned in the other categories follow 
the general pattern revealed in the initial autobiographies. 

Considering the total pattern of Table I, the girls again men- 
tioned a greater percentage of the items than did the boys—fifty- 
eight per cent and forty-two per cent, respectively. The added 
chapters revealed certain curriculum areas dwelt upon during the 
year which were new to the fourth-grade children. The experimental 
group mentioned more areas more often than the control group, 
and the children mentioned their favorite subjects more often than 
their most difficult subjects. 

At the close of the experiment the Iowa Every-Pupil Test of 
Basic Skills (Test B, Form N) was again administered for the
purpose of checking the results obtained in Work-Study Skills 
with a standardized instrument. These results, compared with those 
of the autobiographies, should throw additional light upon the 
significance of the results obtained. Table II gives the mean results 
of this test in the fall of 1949 and the spring of 1950. 


TABLE II. MEAN GAINS IN GRADE EQUIVALENTS MADE IN SEPTEMBER,
1949, AND MAY, 1950, ON TEST OF WORK-STUDY SKILLS BY
EXPERIMENTAL AND CONTROL GROUPS BY SEXES

Test Areas and           Experimental Group     Control Group
Date of Testing            Boys     Girls        Boys     Girls

Map Reading
  September 1949           3.9      3.3          3.5      3.3
  May 1950                 4.9      4.5          5.0      4.3
  Gain                     1.0      1.2          1.5      1.0
Use of References
  September 1949           3.1      3.3          3.7      3.5
  May 1950                 5.8      5.0          3.8      4.0
  Gain                     2.7      1.7           .1       .5
Use of Dictionary
  September 1949           3.5      4.0          3.5      3.5
  May 1950                 5.0      5.0          4.4      4.5
  Gain                     1.5      1.0           .9      1.0
Use of Index
  September 1949           3.2      3.5          3.2      3.5
  May 1950                 4.9      5.0          4.5      4.4
  Gain                     1.7      1.5          1.3       .9
Alphabetization
  September 1949           3.3      3.3          3.4      3.0
  May 1950                 6.2      5.7          4.2      4.4
  Gain                     2.9      2.4           .8      1.4
Total
  September 1949           3.5      3.4          3.5      3.4
  May 1950                 5.3      5.1          4.4      4.3
  Gain                     1.8      1.7           .9       .9


Many interesting trends are revealed by a study of the comparative
results of the initial and final testing program given in this
table. The total picture favored the experimental boys over those
of the control by .9 of a year, whereas the experimental girls
surpassed those of the control by .8 of a year. The greatest growth
occurred in the area of Use of References, followed closely by
Alphabetization, Use of Index, and Use of the Dictionary. In Map
Reading the experimental group was surpassed by the control
group.

Considering 4.9 as the norm for the May, 1950, testing program, 
the experimental group surpassed this norm in all areas except 
that of Map Reading. In this area the girls fell below the norm. 
The control group was below this norm in all areas except in Map
Reading where the boys were one month ahead of the experi- 
mental. The above results of this standardized measure are highly 
significant, making further discussion unnecessary. 
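
The comparison of gains described above is a matter of simple subtraction. The following sketch, assuming Python and using only the total-score means from Table II, shows how the .9 and .8 of a year are obtained; the variable names are illustrative.

```python
# Mean grade equivalents on the total work-study skills score (Table II),
# given as (September 1949, May 1950).
totals = {
    ("experimental", "boys"): (3.5, 5.3),
    ("experimental", "girls"): (3.4, 5.1),
    ("control", "boys"): (3.5, 4.4),
    ("control", "girls"): (3.4, 4.3),
}


def gain(fall, spring):
    """Gain in grade equivalents, rounded to one decimal as in the table."""
    return round(spring - fall, 1)


gains = {key: gain(*scores) for key, scores in totals.items()}

# Advantage of the experimental group over the control group, by sex.
advantage = {
    sex: round(gains[("experimental", sex)] - gains[("control", sex)], 1)
    for sex in ("boys", "girls")
}

print(gains)
print(advantage)
```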

In order to ascertain whether or not the results of the auto- 
biographies are significant as far as the work-study skills are con- 
cerned, Chi-square was used to test the null hypothesis, which 
assumes that there is no difference between the two populations 
and that there is no difference between the responses of the boys 
and girls in the selected sample. A contingency table was set up 
for both the boys and girls separately and in toto involving two 
columns consisting of the experimental and control results, and 
four rows including the initial and added chapters of the
autobiography items. The formula shown by McNemar (1) was used
to determine the value of χ².

This formula was applied to the tabulated results found in Table
I for the added chapter and to the breakdown of the results given
above for the initial chapter. The numbers identified above with an
asterisk were used in this formula. For the total results χ² yielded
a p much less than .001, and the null hypothesis was rejected,
indicating that a real difference exists in these experimental
findings. For the girls χ² likewise yielded a p much less than .001,
and the null hypothesis was again rejected. The results for the boys
yielded a p of approximately .60, and the null hypothesis was not
rejected. All in all, this indicated that a real sex difference
existed, favoring the girls.
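
A test of this general kind can be sketched as follows. This is an illustration under stated assumptions, not the computation actually used: it assumes a plain 2 x 2 test of independence (initial autobiography versus added chapter, by experimental versus control group) using the starred frequencies, with no continuity correction. The layout of the contingency table and the McNemar formula used in the study may differ, so the statistics printed here need not reproduce the reported p values.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square for a 2 x 2 table; returns (chi2, p) with 1 d.f."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With one degree of freedom the upper-tail probability of chi-square
    # equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: initial autobiography, added chapter.
# Columns: experimental group, control group.
girls = [(80, 92), (153, 68)]
boys = [(71, 50), (105, 53)]

chi2_girls, p_girls = chi_square_2x2(girls)
chi2_boys, p_boys = chi_square_2x2(boys)
print("girls:", round(chi2_girls, 2), p_girls)
print("boys: ", round(chi2_boys, 2), p_boys)
```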


RESULTS 


From the above data the following results can justifiably be
listed:

1) The number of items mentioned by the experimental group 
in both the initial and added autobiographies, disregarding sex, 
totaled four hundred nine or sixty-one per cent as compared with 
the two hundred sixty-three items or thirty-nine per cent men- 
tioned by the control group. This difference is significant at less 
than the one per cent level of confidence as revealed by Chi-square 
and can be attributed to the year’s intensive work in the work- 
study skills area. 

2) Of the total six hundred seventy-two items mentioned in both
the initial and added phases of the experiment, the girls mentioned 
three hundred ninety-three items or fifty-eight per cent of the total 
as compared with the two hundred seventy-nine items or forty-two 
per cent for the boys. Chi-square showed the results for the girls 
to be highly significant with a p of less than .001 while the results 
for the boys yielded a p of .60. 

3) A decided sex difference occurred in this experiment favoring 
the girls from the standpoint of items mentioned in the auto- 
biographies. 

4) In the initial autobiographies the experimental group aver- 
aged 3.51 items per pupil and the control group averaged 3.30 
items per pupil. However, in the added chapters the forty-three 
children in the experimental group averaged 6.0 items per pupil 
compared to an average of 2.81 items per pupil in the control group. 
This indicates a definite trend favoring the experimental pro- 
cedures. 

5) The average number of items mentioned by the experimental 
group in both the initial and added chapter was 9.51 as compared 
with 6.1 for the control group. The average number of items men- 
tioned by all the eighty-six children was 7.81. 

6) The experimental group evidenced desirable concept forma- 
tion of work-study skills in listing one hundred eighty-eight terms 
in their added chapter which was seventy-three per cent of the 
total items mentioned by that group. On the other hand, the 
control group listed twenty-two items or eighteen per cent of the 
total mentioned by them. This great difference seems clearly to 
be due to the intensified work in the area of work-study skills and 
the corresponding concept formations. 

7) The fifty-two items classified as adjustive in nature definitely 
relate to the social climate of the classroom. These adjustive terms 
pertain to the personality development of the children and those 
possessing these concepts were, without a doubt, definitely im- 
pressed and better adjusted. 

8) A comparison of differences in gains in grade equivalents 
between the experimental and control groups indicates that the 
greatest growth occurred in the area of use of references, followed 
by alphabetization, use of index, and use of dictionary. All of these 
areas received special emphasis and the gains further indicate that 
the experimental procedures aided in the formation of desirable
concepts in the skill areas as postulated. 

9) In the re-testing both boys and girls of the experimental 
group equaled or exceeded the norm of 4.9 in all areas except that 
of map reading. The boys and girls of the control group were below 
the grade equivalent norm of 4.9 in all areas of the test except map 
reading. Again the major hypothesis is satisfied. 

10) In the fall testing of work-study skills all children fell below 
the grade equivalent norm of 4.0 except the experimental girls in 
the use of dictionary area. 

11) The multi-sensory experiences used here motivated the ac- 
quisition of the concepts of work-study skills through use of the 
concrete approach. 

12) What has been demonstrated as occurring during the nine 
months of this study attests to the postulated major hypothesis 
concerning concept formation of work-study skills. 
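
The per-pupil averages and percentages cited in results 4 through 6 can be checked against the item counts. A minimal sketch, assuming Python and taking the counts from the initial tabulation and Table I; the key names are illustrative.

```python
PUPILS_PER_GROUP = 43  # 18 boys and 25 girls in each group

# Items mentioned, from the initial tabulation and from Table I.
items = {
    "experimental": {"initial": 151, "added": 258, "work_study_added": 188},
    "control": {"initial": 142, "added": 121, "work_study_added": 22},
}


def per_pupil(count, pupils=PUPILS_PER_GROUP):
    """Average number of items per pupil, to two decimals."""
    return round(count / pupils, 2)


summary = {}
for group, n in items.items():
    summary[group] = {
        "initial_avg": per_pupil(n["initial"]),
        "added_avg": per_pupil(n["added"]),
        "combined_avg": per_pupil(n["initial"] + n["added"]),
        # Share of the added-chapter items that fall under work-study skills.
        "work_study_pct": round(100 * n["work_study_added"] / n["added"]),
    }

all_items = sum(n["initial"] + n["added"] for n in items.values())
overall_avg = per_pupil(all_items, pupils=2 * PUPILS_PER_GROUP)

for group, values in summary.items():
    print(group, values)
print("all pupils:", all_items, overall_avg)
```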


CONCLUSIONS 


Autobiographies, as used here, give satisfactory evidence
that this concept formation is more evident among the girls
than the boys. Especially is this true when they are written without
benefit of an outline or guide. Furthermore, autobiographies furnish
a great amount of information and proper dissemination of this 
information can do much to indicate and eliminate maladjust- 
ments. The autobiography is a tool whose potential worth in the 
area of child development has been insufficiently explored, but 
the spontaneity of terms given by the children and the indication 
of adequate conceptual formation of skills being explored make 
this approach highly significant in measuring the effects of learning 
experiences upon individuals over an interval of time. 


THE DEVELOPMENT OF A TEST TO MEASURE
THE INTENSITY OF VALUES 


JOSEPH E. SHORR 


Los Angeles, Calif. 


Empirical evidence of the importance of value systems as an 
organizing and motivating factor in behavior has been well ac- 
cepted. Following its development in the early thirties, Allport and 
Vernon’s Study of Values Test (22) has been the main instrument 
used to measure six systems or patterns of values; namely, the 
Theoretical, Social, Political, Economic, Aesthetic, and Religious. 
Since that time almost fifty articles have been published showing 
the importance and stability of the value concept. 

The Allport and Vernon Study of Values utilizes relative scales
of a forced-choice type, which automatically causes some of
the six scores to be high and some to be low. A higher
score on one type of value makes for a lower score on some other 
type of value. However, it is conceivable that an individual may 
be actually low on all the scales or actually high on all the scales 
or possibly medium on all the scales or other various combinations. 
Moreover, more careful examination revealed that a beginning 
student in college physics may often secure approximately the 
same raw score on the Theoretical scale as a physicist who was 
keenly interested in his field. Furthermore, people who were not 
intensely interested in Aesthetics would sometimes approximate 
in score those strongly interested in Aesthetics. In a similar manner 
the Social, Political, and Religious scales tended not to differentiate 
those who valued quite strongly from those who had better than 
average interest and scored equally high because of the relative 
scales. It appears that the higher and lower extremes of a
representative sample were not differentiated and that an ‘artificial’
interdependency of the scores upon one another existed.

Allport, Vernon and Lindzey (23) have the following to say in
the newly revised edition of the Study of Values: “In interpreting
the results, therefore it is necessary to bear in mind that they reveal
only the relative importance of each of the six values in a given
personality, not the total amount of ‘value energy’ or drive possessed
by an individual. It is quite possible for the highest value of a
generally apathetic person to be less intense and effective than the
lowest value of a person in whom all values are prominent and
dynamic.”

It was felt that this defect could be avoided by measuring the 
intensity of each scale of values from a practicable minimum of in- 
tensity to a practicable maximum of intensity. 


PROCEDURE 


In order to make a new scale that would encompass the entire
range of value-interest intensity as nearly as can be approximated,
the following steps were taken:

1) A broad matrix of items was gathered ranging from, for example,
“Avoid social contacts,” to “Work with labor and management
to help solve their conflicts,” in the Social Scale, with sufficient
items to cover about nine intermediary steps of intensity.
In addition, one hundred and fourteen items that in the opinion 
of the author showed various degrees of intensity in Theoretical 
interests were gathered and collected on cards. One hundred and 
thirty-one Social-value questions of different intensities were so 
secured. One hundred and thirty Aesthetic items of various in- 
tensities, and one hundred ninety-two Economic-Political items 
varying in intensity were also typed onto cards. 

2) Because of empirical usage it was decided that four of the 
six scales be retained and that, in doing so, the Economic and 
Political scales should be combined. It was felt that the Religious 
scale could be eliminated because, as Super (21) says, “The religious
values scores do not, in some cases represent more than the lip 
service of immature persons who have as yet experienced neither 
deep religious feeling nor intellectual doubts concerning religion.” 

3) The items for each scale were then rated on an eleven-point 
scale ranging from a negative avoidance level to a level of maximum 
intensity. The technique employed was essentially the Thurstone
equal-appearing-intervals method of scale construction. The items
were rated by eleven raters, all of whom were familiar with value
theory and
value tests. They were given no other instructions but to rate them 
in intensity from low to high on an eleven-point scale. Each scale 
was described to them by prepared statement as follows: 


Theoretical: A high score indicates that the individual prefers and
considers most worth while those activities which involve a
problem-solving attitude and are related to investigation, research,
and scientific curiosity.

Economic-Political: A high score indicates that an individual prefers
and considers most worth while those activities which involve the
accumulation of money and the securing of executive power.

Aesthetic: A high score indicates that an individual prefers and
considers most worth while those activities which involve art, music,
dance and literature.

Social: A high score indicates that an individual prefers and
considers most worth while those activities which involve service and
help to people, and which exhibit a definite desire to respond and be
with people socially.


4) Two items per scale value from one to eleven were chosen
to represent the test for each of the four values. Because the
negative and avoidance questions caused confusion in a preliminary
tryout among students, they were eliminated. Finally there remained
twenty questions per value scale, or eighty items in all.
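
The rating summary behind steps 3 and 4 can be sketched as follows, assuming Python. Two points are assumptions rather than statements from the text: the "inter-quartile deviation" is taken to be the semi-interquartile range, (Q3 - Q1) / 2, and the ratings shown are hypothetical. The fractional medians in the published lists (for example 2.67) suggest an interpolated median for grouped data, which this plain median does not reproduce.

```python
import statistics


def scale_value_and_spread(ratings):
    """Return (median, semi-interquartile range) of one item's ratings."""
    median = statistics.median(ratings)
    q1, _, q3 = statistics.quantiles(ratings, n=4)
    return median, (q3 - q1) / 2


# Hypothetical ratings of a single item by eleven raters on the
# eleven-point scale.
ratings = [2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6]

median, spread = scale_value_and_spread(ratings)
print("median scale value:", median)
print("semi-interquartile range:", spread)
```

An item with a small spread is one on which the raters agreed closely, which is the sense in which items of minimum variability are preferred.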

The items finally selected as they appear in the test, with the
median scale value and the interquartile deviation (in parentheses),
are as follows:


Weighted  Median Scale
Score     Value         Theoretical Items (Inter-Quartile Deviation)

10         1.00   Develop new mathematical formulas for research. (.66)
           1.50   Do research on the relation of brain waves to thinking. (1.12)
 9         2.67   Study the various methods used in scientific investigations. (1.08)
           2.00   Develop improved procedures in a scientific experiment. (1.00)
 8         3.00   Do an experiment with the muscle and nerve of a frog. (.87)
           3.25   Solve knotty legal problems. (.75)
 7         4.00   Make an international language. (1.50)
           4.50   Develop new kinds of flowers in a small greenhouse. (.87)
 6         5.00   Be a scientific farmer. (1.37)
           5.40   Do algebra problems. (1.16)
 5         6.00   Be a laboratory technician. (1.00)
           6.00   Collect specimens of small animals for a zoo or museum. (1.50)
 4         7.00   Visit a research laboratory in which small animals are being tested in a maze. (.80)
           --     Look at the displays on astronomy in an observatory exhibit. (.96)
 3         --     Visit the fossil display at a museum. (1.12)
           --     Plan the defense and offense you are to use before a tennis game. (1.37)
 2         --     Keep a chemical storeroom or physical laboratory in order. (1.00)
           --     Read the biography of Louis Pasteur. (1.12)
 1         --     See moving pictures in which scientists are heroes. (.75)
           --     Sell scientific books. (.83)


Weighted  Median Scale
Score     Value         Economic-Political Items (Inter-Quartile Deviation)

10         1.00   Own and operate a bank. (1.00)
           1.00   Become a U. S. Senator. (.79)
 9         2.00   Run for political office. (.33)
           2.00   Operate a race track. (1.25)
 8         3.00   Borrow money in order to “put over” a business deal. (1.35)
           --     Address a political convention. (1.05)
 7         --     Buy a run-down business and make it grow. (.75)
           --     Be an active member of a political group. (1.21)
 6         --     Be a chairman of an organizing committee. (.65)
           --     Plan business and commercial investments. (1.35)
 5         --     Lead a round-table discussion. (1.50)
           --     Install improved office procedures in a big business. (1.21)
 4         --     Be a bank teller. (1.37)
           --     Purchase supplies for a picnic. (1.16)
 3         --     Take a course in business English. (1.42)
           --     Live in a large city rather than a small town. (1.37)
 2         --     Major in commercial subjects in school. (.83)
           --     Work at an information desk. (.67)
 1         --     Collect luncheon money at the end of a school cafeteria line. (.33)
           --     Be a private secretary. (.85)

-- Median scale value not legible in the source. Two further median
scale values, 9.75 and 9.25, survive without a row assignment and
presumably belong to rows marked -- in the two lists above.


Weighted  Median Scale
Score     Value         Aesthetic Items (Inter-Quartile Deviation)

10         1.33   Be a ballet dancer. (.86)
           1.00   Paint a mural. (.55)
 9         1.66   Mould a statue in clay. (1.16)
           2.33   Write a new arrangement for a musical theme. (.68)
 8         2.50   Compare the treatment of a classical work as given by two fine musicians. (1.37)
           3.00   Make a comparative study of architecture. (1.50)
 7         3.50   Participate in a summer theatre group. (1.03)
           3.33   Be an interior decorator. (1.25)
 6         4.50   Sketch action scenes on a drawing pad. (1.40)
           5.00   Collect old and rare recordings. (1.50)
 5         5.75   Judge entries in a photo contest. (1.50)
           5.75   Judge window displays in a contest. (1.50)
 4         7.00   Be a sign painter. (1.37)
           7.25   Visit a flower show. (1.16)
 3         7.75   Plant flowers and shrubbery around a home. (1.16)
           7.50   Make and trim household accessories like lamp shades, etc. (1.25)
 2         9.00   Listen to “jive” and “jazz” records. (1.33)
           8.50   Dance to fast numbers. (1.25)
 1         9.60   Play the juke box. (.66)
           9.50   Paint the kitchen white with a red border. (.35)


Weighted  Median Scale
Score     Value         Social Items (Inter-Quartile Deviation)

10         1.00   Work with labor and management to help solve their conflicts. (.00)
           1.67   Be a medical missionary to a foreign country. (.96)
 9         1.80   Work with a group to help the unemployed. (.73)
           1.80   Help agencies locate living places for evicted families. (1.00)
 8         3.00   Like to be with people despite their physical deformities. (1.00)
           3.50   Treat wounds to help people get well. (1.25)
 7         4.00   Serve as a companion to an elderly person. (.87)
           4.67   Belong to several social agencies. (1.50)
 6         5.00   Take a car load of children for an outing. (1.50)
           5.00   Help people be comfortable when traveling. (1.04)
 5         6.33   Send a letter of condolence to a neighbor. (1.25)
           6.00   Meet new people and get acquainted with them. (1.04)
 4         7.00   Go with friends to a movie. (1.50)
           7.50   Attend a dance. (.83)
 3         8.00   Help distribute food at a picnic. (1.25)
           7.75   Dine with class-mates in the school cafeteria. (1.16)
 2         9.00   Play checkers with members of your family. (1.16)
           9.00   Play checkers. (.87)
 1        10.12   Make a phone call for movie reservations. (.83)
           9.67   Ride in a bus to San Francisco. (.91)


In the actual test all items whose median scale-value ranged 
from 1.00 to 1.75 were considered as having a 1 value. The same 
procedure was applied to all the items so that an activity whose 
scale value was 6.33 was considered a 6 item, whereas a 6.89 was 
considered a 7 item. Moreover, for certain scoring reasons, those 
items having a 1 value were given a 10 weighted score, those items 
having a 2 value were given a 9 weighted score, etc. In short, those 
people who had the greatest intensity would score the most points. 
This reversal was convenient and in no way affected the efficiency 
of the scale. 
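
The conversion just described can be sketched as follows, assuming Python. The cut-off at .75 is read from the single range the text states (1.00 to 1.75) together with the two examples given (6.33 and 6.89); how values between .75 and the next whole number were treated in every case is an assumption, and the printed item lists may depart from this rule for individual items. The function names are illustrative.

```python
import math


def scale_step(median_value):
    """Map a median scale value to its whole-number step (n.00 to n.75 -> n)."""
    whole = math.floor(median_value)
    fraction = round(median_value - whole, 2)
    return whole if fraction <= 0.75 else whole + 1


def weighted_score(median_value):
    """Reverse the step so that step 1 scores 10, step 2 scores 9, and so on."""
    return 11 - scale_step(median_value)


for value in (1.00, 1.75, 6.33, 6.89, 10.12):
    print(value, scale_step(value), weighted_score(value))
```

Because the weighted score is only a reversal of the step number, ranking individuals by either quantity gives the same order, which is why the reversal does not affect the scale.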


RESULTS 


The test or scale was tried out on 389 females and 352 males. 
Of the 389 females, 126 were college sophomores and 263 were 
high-school seniors. Of the 352 males, 121 were college sophomores, 
and 231 were high-school seniors. Separate norms were kept for 
each sex and educational group. While sex differences were found 
on the four value scales, no significant differences between the high- 
school seniors and college sophomores were obtained on any of the 
four scales. For all practical purposes, pooling of the high-school 
seniors’ and college sophomores’ scores for each sex resulted in
usable general norms. 

The reliability of each of the scales was computed by the split-half
technique (3). Each scale was scored for each half of the
weighted items that make up a scale. In other words, since there 
are two items in each scale that have a weighted value of 10, 9, 
8, 7, 6, 5, 4, 3, 2, and 1, the test was scored for each half of all 
the weighted items and the reliability between the halves was 
computed. The following reliability coefficients based upon one 
hundred twenty-six female college sophomores were obtained: .84 
for the Theoretical Scale, .82 for the Aesthetic Scale, .78 for the 
Economic-Political Scale and .72 for the Social Scale. The median 
age of this group was 18.31 with a range of 16 to 51. 
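
The split-half computation described above can be sketched in Python. This is an illustration under stated assumptions, not the procedure actually used: the scoring rule (a half score is the sum of the weights of the endorsed items), the helper names `half_score` and `pearson_r`, and the four response patterns are all hypothetical. No step-up correction is applied, since the text does not mention one.

```python
from math import sqrt

# Each half of a scale holds one item of every weight 10, 9, ..., 1.
WEIGHTS = list(range(10, 0, -1))

def half_score(endorsed):
    """Sum the weights of the endorsed items of one half (assumed scoring rule)."""
    return sum(weight for weight, chosen in zip(WEIGHTS, endorsed) if chosen)

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical endorsement patterns (1 = endorsed), one pair of halves per person.
respondents = [
    ([1, 1, 1, 0, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]),
    ([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]),
    ([1, 0, 1, 0, 1, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]),
    ([0, 0, 0, 0, 0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]),
]

first_half = [half_score(a) for a, _ in respondents]
second_half = [half_score(b) for _, b in respondents]
print("split-half r =", round(pearson_r(first_half, second_half), 2))
```

With real data, `respondents` would hold one pair of half-scale response patterns per person tested.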

The Lewerenz formula (9) for determining reading grade level, 
applied to the items, yielded a grade level of 7.8. Despite this 
low ‘average’ reading level, inspection of the test reveals 
several words which may be regarded as above average in difficulty. 
This is to be expected, however, where the upper value intensities 
are to be canvassed. 

One of the difficulties in using the Allport-Vernon Study of 
Values with the general population has been the high vocabulary 
grade level. Stefflre (20) found an 11.3 grade level for the old test. 
The revised edition has made an attempt to lower the vocabulary 
level of the test. The Lewerenz formula applied to the revised 
edition yielded a 10.96 grade level. This is only somewhat better. 

Certain sex differences are in evidence. The median scores on 
each of the four scales for the two sexes are as follows: Theoretical 
Scale, men 45.5, women 32.0; Economic-Political Scale, men 49.0, 
women 32.5; Aesthetic Scale, men 38.5, women 62.0; and the 
Social Scale, men 56.0, women 71.5. 

Further research is now being carried out to secure data on 
various occupations ranging from the Professional to the Unskilled 
group. 


SUMMARY 


In order to construct scales that measure the intensity of ‘drive’ 
of value activities, in contrast to the forced-choice type of scale 
of Allport-Vernon-Lindzey, hundreds of items were rated by raters 
on a scale of intensity from lowest to highest, and the items of 
minimum variability were selected for use in each of four scales: 
the Theoretical, Economic-Political, Aesthetic, and Social. Norms 
were secured on 352 males and 389 females. Fairly high reliabilities 
were secured. Sex differences were found on each of the four scales. 

Finally it was pointed out that research is in progress to compare 
various occupational groups as to value intensity. 


ON THE DESIGN OF GROUPING PROBLEMS 
AND RELATED INTELLIGENCE TESTS 


KARL MENGER 


Illinois Institute of Technology 


Part I of the present paper contains a critical analysis of a 
particular test in Figure Grouping. In Part II, a general method 
is developed by which correct tests of this type can be constructed. 
Part III contains examples which considerably widen the scope of 
both the critical and the positive part. 


PART I. THE LOGIC OF FIGURE GROUPING 


(1) An example.—An intelligence test administered during the 
war to a large group of college students included a question which, 
as far as I remember, read as follows: 


“Which of the five figures below has a property not possessed by any of 
the other four figures?” 


Fig. 1 

Upon an inquiry, the testing agency revealed that the expected 
answer was “The Square,” the square being the only one of the 
five figures which is black. Any other answer was held against the 
intelligence, more specifically, against the reasoning ability, of the 
tested person. 

Wartime duties prevented me from pursuing the matter beyond 
a protest to the testing agency which had no effect. Recently, my 
concern was renewed when it came to my attention that questions 
like the one mentioned are still being used in intelligence tests. 

To begin with the above example, it is clear that each of the 
five figures has a ‘distinctive’ property, that is, a property not 
possessed by any of the other four figures. In fact, for each figure 
we shall list two properties neither of which can be claimed for 
any of the other four figures. 

(1) The circle is the only one of the five figures which (a) has 
no corners, (b) has infinitely many axes of symmetry. 

(2) The triangle is the only one of the five figures which (a) has 
exactly three corners, (b) has only one axis of symmetry (namely, 
a vertical axis of symmetry). 

(3) The rectangle is the only one of the five figures which (a) 
has equal angles and unequal sides, (b) has exactly two axes of 
symmetry (namely, a horizontal and a vertical axis of symmetry). 

(4) The square is the only one of the five figures which (a) has 
equal sides and equal angles, (b) has four axes of symmetry 
(namely, a vertical, a horizontal and two diagonal axes). 


Fig. 2 

(5) The parallelogram is the only one of the five figures which 
(a) has no vertical symmetry, (b) has four angles none of which is right. 

Clearly, each of the figures has further distinctive properties. 
For instance, the blackness of the square has not been included in 
the above list. Moreover, the mere property of being a circle is 
distinctive for the first figure; the property of being a triangle, for 
the second; and so on. 

The situation is by no means due to a shortcoming of the par- 
ticular selection of five figures. Consider, for instance, the group of 
three circles and one square in Fig. 2. The tested person might be 
expected, even more strongly than in the first example, to single out 
the square. But nothing is logically wrong with singling out the 
first circle as the only figure whose area exceeds a square inch; 
the second, as the only figure with an area less than one half of a 
square inch; the third as the only circle with an area close to one 
square inch. 

(2) The logical situation.—The logic of the matter can be de- 
scribed as follows: First consider two non-identical objects. By 
Leibnitz’ principle of the identity of indiscernibles, the non-identity 
of the two objects implies the existence of a property possessed by 
one of them, say the first, and not possessed by the other one. 
Clearly, the negation of this property is a property of the second 
object not possessed by the first. 

Next consider a class of more than two objects no two of which 
are identical. By virtue of Leibnitz’ principle, for each pair of 
objects belonging to the class, there is a property possessed by the 
first but not possessed by the second object of the pair. But the 
makers of tests are interested in properties of an object which are 
distinctive, that is, not possessed by any other object of the class. 
The question arises whether each object has a distinctive property 
(as was the case in the examples mentioned above); or whether 
several objects have distinctive properties while others do not; 
or whether none of the objects has a distinctive property; or 
finally whether exactly one object has a distinctive property. The 
presence of the last situation is implicitly assumed in the test 
questions. While this assumption is often erroneous, the last 
mentioned situation is, at any rate, the one desired by the makers 
of tests. We shall therefore call a class of objects a ‘test class’ if 
exactly one of the objects of the class has a distinctive property. 

If the realm of properties to be taken into consideration is 
limited and specified, then it is easy to show that each of the four 
above mentioned situations can actually arise. Consider, for in- 
stance, three categories: color, form of the contour, and number of 
sides; and in each category two properties: white-black, dotted- 
solid, square-triangular. Suppose that it is explicitly stipulated 
that only the above categories are to be taken into consideration. 
Then consider the following groups of three or four objects. 

Example I. A ‘black’ solid square, a white ‘dotted’ square, and 
a white solid ‘triangle.’ Each figure has a distinctive property in 
single quotes. 

Example II. A ‘black’ solid ‘triangle’, a white ‘dotted’ square, and 
a white solid square. The first figure has two distinctive properties, 
the second has one, the third has none. 

Example III. A black solid square, a black solid triangle, a white 
solid triangle, and a white solid square. None of the four objects 
has a distinctive property. In this example the contour of the four 
figures has not been varied. The example would remain valid if 
two contours were dotted and two left solid. 

Example IV. A ‘black’ dotted square, a white solid triangle, a 
white solid square, and a white dotted triangle. Only the first 
object has a distinctive property, namely, blackness. 
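
The four examples can be checked mechanically. The following Python sketch is illustrative only and not part of the original argument; the function names `distinctive_properties` and `is_test_class` and the dictionary encoding of the figures are assumptions introduced here. For each object it lists the specified properties that no other object of the class shares.

```python
from collections import Counter

def distinctive_properties(objects):
    """For each object, list the (category, property) pairs held by no other object."""
    categories = list(objects[0].keys())
    counts = {c: Counter(obj[c] for obj in objects) for c in categories}
    return [[(c, obj[c]) for c in categories if counts[c][obj[c]] == 1]
            for obj in objects]

def is_test_class(objects):
    """A test class has exactly one object with a distinctive property."""
    return sum(1 for props in distinctive_properties(objects) if props) == 1

def fig(color, contour, shape):
    """Encode one figure by the three specified categories."""
    return {"color": color, "contour": contour, "shape": shape}

examples = {
    "I": [fig("black", "solid", "square"), fig("white", "dotted", "square"),
          fig("white", "solid", "triangle")],
    "II": [fig("black", "solid", "triangle"), fig("white", "dotted", "square"),
           fig("white", "solid", "square")],
    "III": [fig("black", "solid", "square"), fig("black", "solid", "triangle"),
            fig("white", "solid", "triangle"), fig("white", "solid", "square")],
    "IV": [fig("black", "dotted", "square"), fig("white", "solid", "triangle"),
           fig("white", "solid", "square"), fig("white", "dotted", "triangle")],
}

for name, objects in examples.items():
    per_object = [len(props) for props in distinctive_properties(objects)]
    print(f"Example {name}: distinctive properties per object {per_object}, "
          f"test class: {is_test_class(objects)}")
```

Under this encoding only Example IV qualifies as a test class, in agreement with the text.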

Example IV is the only one that admits a unique answer to the 
question “Which object has a property not possessed by any of the 
others?” It represents the only test class. But even in this example 
one has to limit and specify the properties to be taken into con- 
sideration in order to be able to say that the first object is the 
only one with a distinctive property. The first object may cease 
to have this distinction 

a) if properties other than those specified (such as position 
relative to the margin of the page or order of the objects) are 
taken into consideration (for instance, the second object might be 
characterized by the property of being immediately to the left of 
the vertical median line of the page); 

b) if different specified properties (such as color and shape) are 
combined into one (for instance, the third object might be 
characterized as the only white square¹ in the group). 

To my knowledge, however, in intelligence tests the realm of 
properties to be taken into consideration is not usually limited 
and specified. 

As an empirical extension of Leibnitz’ logical principle, the fol- 
lowing law might be formulated: In every finite class of objects 
no two of which are identical, each object has a distinctive property. 

The reader may apply the above ideas to the two examples 
taken from recent tests in Figure Grouping. (Fig. 3) 





¹ As Dr. J. K. Senior pointed out, language habits have to be taken into 
consideration in formulating grouping tests. If, in Example IV, squares and 
triangles are replaced by horses and cows, the German language has, indeed, 
a single word for “white horse” (Schimmel). 


PART II. A METHOD OF CONSTRUCTING CORRECT GROUPING TESTS 


(3) The construction of test classes of five.—If the realm of 
properties to be taken into consideration is limited and specified, 
then the construction of five objects O₁, O₂, O₃, O₄, O₅ in which 
exactly one object has a distinctive property is a simple 
combinatorial problem. 

First consider five categories each containing two properties 
(such as white-black, dotted-solid) or, as we shall say, briefly, 
‘binary categories.’ We shall indicate each category by a Roman 
numeral. In each category, we denote one property by +, the 
other by −. If O₁ is the only object with a distinctive property, 
namely, the property + of category I, then O₂, O₃, O₄, and O₅ 
must have the property − of category I. Besides, we shall let 
each of the objects O₂, O₃, O₄, O₅ share with O₁ the property − of 
one category to keep them from having distinctive properties. This 
leads to the following scheme: 


        I    II   III  IV   V 
O₁:     +    −    −    −    − 
O₂:     −    −    +    +    + 
O₃:     −    +    −    +    + 
O₄:     −    +    +    −    + 
O₅:     −    +    +    +    − 


Of course, the objects, the categories, and the properties may 
be permuted. 

In a more economical way, a test class of five can be constructed 
by means of three binary categories. Again, let O₁ be the only 
object with a distinctive property, namely, property + of category 
I. The other objects shall, in addition to property − of category 
I, have the following properties in categories II and III: 


O₂: + +;   O₃: + −;   O₄: − +;   O₅: − −. 


What properties O₁ has in addition to property + of category I is 
immaterial. 
Suppose, for instance, 
I is type of contour (dotted +, solid −) 
II is the color (white +, black −) 
III is the shape (square +, triangular −). 

In the above class, O₁ is the only object with a distinctive property 
(dottedness). (Fig. 4) 
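
The economical construction can be written out as a short Python sketch. It is offered as an illustration only; the names `build_class` and `distinctive_counts` are assumptions introduced here. The dotted figure is given each of the four possible color and shape combinations in turn, to check that its remaining properties are indeed immaterial.

```python
from collections import Counter
from itertools import product

COLORS = ("white", "black")
SHAPES = ("square", "triangular")

def build_class(first_color, first_shape):
    """Return the five figures as (contour, color, shape) triples.

    The first figure is dotted; the other four are solid and run through
    all four combinations of color and shape.
    """
    first = ("dotted", first_color, first_shape)
    rest = [("solid", color, shape) for color, shape in product(COLORS, SHAPES)]
    return [first] + rest

def distinctive_counts(objects):
    """Number of distinctive properties held by each object."""
    columns = [Counter(column) for column in zip(*objects)]
    return [sum(1 for i, prop in enumerate(obj) if columns[i][prop] == 1)
            for obj in objects]

for color, shape in product(COLORS, SHAPES):
    print(color, shape, distinctive_counts(build_class(color, shape)))
```

In every case the count list is `[1, 0, 0, 0, 0]`: the dotted figure alone has a distinctive property, and it has only that one.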

Next consider a ‘ternary’ category, that is, a category containing 
three properties such as white-gray-black. If the three properties 
are distributed among five objects, then at least one object will 
have a distinctive property. For instance, the five objects may 
have the colors white, white, gray, gray, black, in which case the 
last object has the distinctive property of blackness. If the colors 
were chosen according to the scheme white, gray, black, black, 
black, then two objects would have distinctive properties: exactly 
one object would be white, and exactly one would be gray. A test 
class cannot contain two such objects. Moreover, it is clear that if 
two ternary categories are selected, then either again two (or even 
more) objects have distinctive properties (which is forbidden), or 
at least one object has two (or more) distinctive properties. 


Fig. 4 


We shall call a test class perfect if the only object possessing a 
distinctive property has only one distinctive property. 

The combination of a ternary category (white-shaded-black) and 
a binary category (square-triangular) leads to the most economical 
construction of a perfect test class of five. The example, Fig. 5, 
includes three squares of all colors and two triangles of different 
colors. The square with the color not matched by a triangle is the 
only object with a distinctive property. 

If categories containing more than three properties are intro- 
duced in a class of five objects, then the existence of exactly one 
object with a distinctive property is ruled out. For if four properties 
are distributed among five objects, then only one property occurs 
twice and at least three objects have distinctive properties. In the 
example mentioned at the beginning of this paper, one of the 
categories, namely shape, contained five properties (circle, triangle, 
rectangle, square, parallelogram). This fact alone gave every object 
a distinctive property. 

We thus arrive at the following rule for the construction of 
perfect test classes of five objects with specified properties to be 
taken into consideration: Only binary categories (such as white- 
black, solid-dotted) can be used with the possible exception of one 
ternary category (such as white-gray-black). If a ternary category 
is used, then it supplies the distinctive property. 
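
The counting argument behind this rule can be verified by brute force. The sketch below is an illustration, not part of the original paper; the function name `singleton_counts` is hypothetical, and it assumes that every property of a category occurs on at least one of the objects. It enumerates all ways of distributing k properties over five objects and records how many objects end up with a distinctive property.

```python
from collections import Counter
from itertools import product

def singleton_counts(n_objects, n_properties):
    """Possible numbers of objects holding a property that no other object holds."""
    results = set()
    for assignment in product(range(n_properties), repeat=n_objects):
        counts = Counter(assignment)
        if len(counts) < n_properties:
            continue  # assumption: every property of the category is used
        results.add(sum(1 for c in counts.values() if c == 1))
    return sorted(results)

for k in range(2, 6):
    print(f"{k} properties over 5 objects: {singleton_counts(5, k)}")
```

For five objects, a binary category yields 0 or 1 distinctive objects, a ternary category 1 or 2, a category of four properties always 3, and a category of five properties always 5, which is the situation in the opening example.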

(4) The construction of perfect test classes of more than five.—Since 
the variety of perfect test classes of five is small, it may be of 
interest to see how larger perfect test classes can be constructed. 


Fig. 5 


Perfect test classes of six can be obtained by means of four binary 
categories. Below, we indicate each category by a Roman numeral, 
and the properties of each category by + and —. 


        I    II   III  IV 
O₁:     +    +    +    − 
O₂:     −    +    +    + 
O₃:     −    +    −    − 
O₄:     −    −    +    + 
O₅:     −    −    −    + 
O₆:     −    −    −    − 


If one ternary category is admitted, then two additional binary 
categories permit the construction. The three properties of the 
ternary category I will be denoted by +, 0, —. 


[Scheme for O₁ through O₆ over the ternary category I and the binary 
categories II and III; the entries are not legible in the source.] 

If two ternary categories are admitted, an example can be ob- 
tained as follows: 


        I    II 
O₁:     +    − 
O₂:     −    + 
O₃:     −    0 
O₄:     0    0 
O₅:     0    − 
O₆:     0    + 

Finally, it may be of interest to see test groups of eight and of 
ten objects which can be obtained by means of a quaternary 
category I containing the properties A, B, C, D, and a ternary 
category II containing properties a, b, c. 


        Eight                      Ten 

Objects   I    II        Objects   I    II 
O₁:       A    c         O₁:       A    a 
O₂:       B    a         O₂:       B    a 
O₃:       B    b         O₃:       B    b 
O₄:       C    a         O₄:       B    c 
O₅:       C    b         O₅:       C    a 
O₆:       D    a         O₆:       C    b 
O₇:       D    b         O₇:       C    c 
O₈:       D    c         O₈:       D    a 
                         O₉:       D    b 
                         O₁₀:      D    c 


An example of the last scheme is obtained if category I denotes 
the number of sides of a figure, 

I: 6 sides: A; 5 sides: B; 4 sides: C; 3 sides: D; and category 
II denotes color, 

II: black: a; shaded: b; white: c. 


Fig. 6 


The resulting test class is shown in Fig. 6. O₁ is the only figure 
with a distinctive property (having six sides). 

We conclude with three examples of perfect test classes of twelve. 
Example 1 is based on two quaternary categories: category I, 
the properties of which are denoted by A, B, C, D, and category II, 
consisting of the properties a, b, c, d. Example 2 is based on the 
above quaternary category I, a ternary category II′ of properties 
a, b, c, and a binary category III′ of properties α, β. Example 3 is 
based on the quaternary category I and two binary categories: II″, 
including the properties a, b, and III″, including the properties 
α, β. An x in any of the examples indicates that any property of the 
category heading the same column may be substituted. 

        Example 1      Example 2           Example 3 
        I    II        I    II′   III′     I    II″   III″ 
O₁:     A    x         A    x     x        A    x     x 
O₂:     B    a         B    a     x        B    a     α 
O₃:     B    b         B    b     x        B    a     β 
O₄:     B    c         B    c     x        B    b     x 
O₅:     C    a         C    a     x        C    a     α 
O₆:     C    b         C    b     x        C    a     β 
O₇:     C    c         C    c     α        C    b     α 
O₈:     C    d         C    c     β        C    b     β 
O₉:     D    a         D    a     x        D    a     α 
O₁₀:    D    b         D    b     x        D    a     β 
O₁₁:    D    c         D    c     α        D    b     α 
O₁₂:    D    d         D    c     β        D    b     β 


It goes without saying that in all these examples further cate- 
gories might be added which do not supply distinctive properties. 
The examples given in this section attain their respective purposes 
with minimum numbers of categories. 
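
The role of the entries marked x can be checked by exhaustive substitution. The Python sketch below is illustrative only: the row strings are a transcription of the three schemes of twelve, with α and β standing for the two properties of the binary category, and the names `distinctive_counts`, `substitutions`, and `always_perfect` are assumptions introduced here. Every x is replaced in turn by each property of its column, and each resulting class is tested for perfection.

```python
from collections import Counter
from itertools import product

# Rows O1..O12 of each example; "x" stands for any property of that column.
EXAMPLES = {
    "Example 1": (["ABCD", "abcd"],
                  ["Ax", "Ba", "Bb", "Bc", "Ca", "Cb", "Cc", "Cd",
                   "Da", "Db", "Dc", "Dd"]),
    "Example 2": (["ABCD", "abc", "αβ"],
                  ["Axx", "Bax", "Bbx", "Bcx", "Cax", "Cbx", "Ccα", "Ccβ",
                   "Dax", "Dbx", "Dcα", "Dcβ"]),
    "Example 3": (["ABCD", "ab", "αβ"],
                  ["Axx", "Baα", "Baβ", "Bbx", "Caα", "Caβ", "Cbα", "Cbβ",
                   "Daα", "Daβ", "Dbα", "Dbβ"]),
}

def distinctive_counts(rows):
    """Number of distinctive properties of each object (one row per object)."""
    columns = [Counter(column) for column in zip(*rows)]
    return [sum(1 for i, prop in enumerate(row) if columns[i][prop] == 1)
            for row in rows]

def substitutions(domains, rows):
    """Yield every class obtained by filling each x from its column's properties."""
    per_row = [product(*(domains[i] if prop == "x" else prop
                         for i, prop in enumerate(row)))
               for row in rows]
    for filled in product(*per_row):
        yield ["".join(row) for row in filled]

def always_perfect(domains, rows):
    """True if, for every substitution, only O1 is distinctive, by one property."""
    expected = [1] + [0] * (len(rows) - 1)
    return all(distinctive_counts(filled) == expected
               for filled in substitutions(domains, rows))

for name, (domains, rows) in EXAMPLES.items():
    print(name, always_perfect(domains, rows))
```

All three schemes remain perfect under every substitution, so the x entries are free in the sense stated in the text.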


PART III. APPLICATIONS AND CONCLUSIONS 


(5) Applications to other types of tests.—It is clear that the ideas 
developed in both preceding parts apply to numerous types of tests 
other than Figure Grouping. 

A recent test in Letter Grouping includes the four groups 


AABC ACAD ACFH AACG 


“Three of the groups are alike in some way. . . Mark the one that 
is different.” The testee is supposed to mark the third group, 
ACFH, because it is the only one which does not have two A’s. 
But, clearly, the second group, ACAD, is the only one in which 
the letters are not in alphabetical order, and also the only one 
containing the letter D; the first group, AABC, is the only one 
containing only the first three letters of the alphabet, and also 
the only one in which any two consecutive letters are neighbors 
in the alphabet; the fourth group, AACG, is the only one contain- 
ing two consecutive letters which determine an interval of three 
letters between them in the alphabet (C ...G), and also the only 
one containing the letter G. 
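
The plurality of defensible answers in the letter-grouping item can be exhibited directly. In the Python sketch below, which is an illustration rather than part of the original paper, the property names and the helper `odd_one_out` are assumptions introduced here; each property is a yes-or-no question about a group, and the helper returns the one group whose answer differs from that of the other three.

```python
GROUPS = ["AABC", "ACAD", "ACFH", "AACG"]

def gaps(group):
    """Alphabet distance between each pair of consecutive letters."""
    return [ord(b) - ord(a) for a, b in zip(group, group[1:])]

PROPERTIES = {
    "has two A's": lambda g: g.count("A") == 2,
    "letters in alphabetical order": lambda g: list(g) == sorted(g),
    "contains D": lambda g: "D" in g,
    "uses only A, B, C": lambda g: set(g) <= set("ABC"),
    "three letters skipped between consecutive letters":
        lambda g: 4 in (abs(d) for d in gaps(g)),
    "contains G": lambda g: "G" in g,
}

def odd_one_out(prop):
    """Return the single group whose truth value differs from all the others."""
    values = [prop(g) for g in GROUPS]
    for verdict in (True, False):
        if values.count(verdict) == 1:
            return GROUPS[values.index(verdict)]
    return None

for name, prop in PROPERTIES.items():
    print(f"{name}: {odd_one_out(prop)}")
```

Each of the four groups is singled out by at least one of these properties, so the expected answer ACFH is only one of several that can be motivated.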

In most tests in Word Grouping the situation is, practically 
speaking, less ambiguous although, theoretically, the differences 
between different types of grouping are not essential. Also with 
regard to words, a unique solution can be expected only if the 
properties to be considered are limited and specified, and a test 
class is constructed by the method explained in Part II. Otherwise 
there is the danger of a plurality of reasonable solutions due to 
hidden relations which can easily be overlooked in formulating 
the test. 

Suppose a child is asked to underline the word not belonging in 
the group 


apple, pear, strawberry, cloud, plum. 


The boy underlining strawberry, because it is the only one of the 
five objects to which he usually has to look down, may be the 
potential genius in the group of tested children. 

The principles developed in the preceding parts apply also to 
tests in Eduction of Relations. Problems of this kind are included 
in reasoning tests as well as in tests concerning specific knowledge. 
A fictitious example will illustrate the point. 


van Gogh: van Beethoven = Raphael: X. 
Suppose the following four choices are given for X: 
a) Milton; b) Galileo; c) Mozart; d) Chaucer. 


In favor of b) it might be said that Beethoven was of the same 
(Flemish-Dutch) extraction as van Gogh, and that only Galileo 
was of the same (Italian) extraction as Raphael. In favor of d) it 
might be said that Beethoven lived before van Gogh, and that 
only Chaucer lived before Raphael. In favor of c), Mozart is the 
only composer. By choosing Mozart one would, however, emphasize 
analogous properties rather than educe identical relations, since 
it is artificial to speak of a painter-to-composer relation. One 
might support even the choice a) by pointing out that only Milton 
had an infirmity (blindness) comparable to Beethoven’s deafness. 

The concepts developed in the preceding sections apply also to 
the type of Figure Classification introduced by Spearman. In 
these tests, Group I differs from Group II. Each of the test 
symbols that belongs to Group I is to be checked. 


[Figure: Group I, Group II, Test Symbols] 


In the example shown, the rule which Spearman had in mind is that 
the figures in Group I are closed, while those in Group II are open. 
A second 
rule, incidentally leading to the same classification of the four test 
symbols, is the existence of more than one axis of symmetry for 
each figure of Group I while no figure of Group II has two axes of 
symmetry. If the contour of a semi-circle were added as a fifth 
test symbol, then this symbol would belong to Group I by virtue of 
its closedness, but to Group II by virtue of the absence of two 
axes of symmetry. 

If, in Spearman’s example, the second test symbol is interchanged 
with the second figure of Group II, another ambiguity arises. 


[Figure: Group I, Group II, Test Symbols] 


The figures in Group I have horizontal symmetry while no figure in 
the new Group II is symmetric about a horizontal axis. Hence the 
vertical semi-circle would belong to Group II because it is open, 
but to Group I because it is horizontally symmetric. 

Fig. 7 

These examples illustrate how the principles developed in the 
preceding section can be applied to the tests of the Spearman type. 
In this connection, a binary property is ‘distinctive’ if (1) it be- 
longs to each figure of one group and (2) it does not belong to any 
figure of the other group. In Spearman’s example, closedness is a 
distinctive property. Another one is the existence of more than one 
axis of symmetry. The property of having corners is not distinctive: 
in Group I, only the second and the third; in Group II, only the 
first and the third figures, have this property. 

To construct a test of the Spearman type, two groups of figures 
differing in several non-distinctive and in at least one distinctive 
property are needed. We shall call the test ‘perfect’ if only one 
distinctive property can be found. The test is ‘ambiguous’ if the 
two groups differ in two distinctive properties and a test symbol 
shares one distinctive property with the figures in Group I, and 
the other distinctive property with the figures in Group II. 

In order to avoid such ambiguities, Spearman introduced figures 
consisting of several components (several lines, several circles, etc.) 
and used relations between these components (parallelism, proximity, 
equality, etc.) as distinctive properties of the figures of the 
two groups. In fact, in most of his excellent examples, the main 
difficulty for the inexperienced is to find even one distinctive 
property. 

(6) A comparison with number series tests.—A number series test 
consists of a row of numbers which “follow one another according 
to some rule.” The testee is expected to “find the rule and fill in 
the blanks to fit the rule.” For instance, in 2, 4, 6, 8, __ he is 
expected to notice that the given series consists of the first four 
even numbers in their natural order and to write 10, the fifth even 
number, in the blank. 

Most mathematicians are opposed to tests of this type for the 
following reason: If a series of k numbers n₁, n₂, ..., nₖ is given, 
one can, for any number N, find a rule according to which one 
should write N in the blank. Hence, whatever number is written 
in the blank, it can be said to fit some rule. 
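
One way to make this objection concrete is polynomial interpolation: through k + 1 points with distinct positions there is always a polynomial of degree at most k. The Python sketch below is an illustration under that choice of rule, which is only one of many possible; the names `lagrange_value` and `rule_for` are assumptions introduced here. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def lagrange_value(points, x):
    """Value at x of the lowest-degree polynomial through the given (x, y) points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def rule_for(series, blank):
    """Return a polynomial rule that reproduces the series and then gives `blank`."""
    points = list(enumerate(series + [blank], start=1))
    return lambda n: lagrange_value(points, n)

series = [2, 4, 8, 16]
for blank in (32, 30, 29):
    rule = rule_for(series, blank)
    print(blank, [int(rule(n)) for n in range(1, 6)])

# The cubic through the four given terms alone continues the series with 30.
print(lagrange_value(list(enumerate(series, start=1)), 5))
```

Whatever number is put in the blank, `rule_for` produces a rule that the completed series obeys.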

Yet two important points may be advanced in support of number 
series tests. The ability to derive, by induction, a simple rule in a 
short time is valuable in many domains of extra-scientific activity, 
as well as in science and, to a limited extent, even in mathematics. 
The motivation of an unexpected continuation of a given number 
series, as well as the actual operations leading to the unexpected 
number, are in most cases time consuming. For instance, to obtain 
30 as the continuation of 2, 4, 8, 16 certainly takes longer than 
to obtain 32, and to obtain 29 would take still more time. 

If, at the beginning of the test, the testees are told that they 
will be credited with the number of solutions attained within the 
allotted time, then probably few unexpected answers will be sub- 
mitted. It is doubtful whether even students of mathematics would 
turn in many unexpected answers together with the motivating 
law, both obtained within the allotted time. 

In contrast, in the grouping test mentioned at the beginning of 
this paper, it does not take longer to notice that the first figure is 
the only round one than that the fourth is the only black one. In 
grouping tests the time element is much less important than it is 
in the number series problems. 


GENERAL CONCLUSIONS 


The author believes that ambiguities of the type criticized in 
this paper ought to be carefully avoided in designing tests. If in 
the example previously mentioned the testee is expected to mark 
the square, then he simply is expected to guess what the designer 
of the test happened to have in mind. With the testee’s intelligence, 
in general, and with his reasoning ability, in particular, the expected 
answer has nothing to do. 

Even if there should be a high correlation between the expected 
answer in an ambiguous test and the positive outcome of an inde- 
pendent sound intelligence test, the author would not admit that 
an answer different from the one which happens to be expected, 
supplies any relevant information against the intelligence of the 
testee. 

We should even go so far as to suggest that, on the contrary, 
unexpected groupings may well be a sign of superior intelligence. 
In order to make an answer based on a deliberate unexpected 
grouping distinguishable from a random answer, the testee would, 
of course, have to indicate the motivation or the principle of 
classification which has guided his choice. Unfortunately, this re- 
quirement rules out children, and makes the grading and the 
interpretation of the test difficult. 

If a small number of more mature persons are to be subjected 
to an intelligence test, an ambiguous grouping test may be an ex- 
cellent criterion. It would have to be phrased somewhat like this: 
“Find a group of four (or, if you can, several groups of four) 
among the following five items which have a common property 
not shared by the fifth. In each case state the property.” Then 
the items would follow: words, figures, groups of letters, as the 
case may be. 


COMMENTS ON THE CORRELATIONAL 
ANALYSIS REPORTED IN INTELLIGENCE 
AND CULTURAL DIFFERENCES 


FRED T. TYLER 


University of California 


The fact of a positive correlation between socio-economic 
status or social class and measures of general intelligence has long 
been recognized. Numerous hypotheses about the nature of this 
relationship have been suggested: 

1) Higher test scores among high-status children arise from ge- 
netic differences; 

2) The environment of low-status children produces real inferi- 
ority in their intelligence; 

3) Test materials are biased in favor of high-status groups; 

4) Motivational patterns produce differences in test performance. 
The factors indicated above are not mutually exclusive, but may 
operate together in various combinations. 

A recent volume of Eells et al., Intelligence and Cultural Differ- 
ences, is concerned with a number of problems in this field; only 
one of these will be considered in this paper: the relative bias, in 
different status¹ groups, of verbal and non-verbal intelligence tests. 


CORRELATIONS BETWEEN STATUS AND IQ 


Eells computed the correlations between the Index of Status 
Characteristics (ISC) and IQ’s obtained from various tests for 
groups of over 2,000 children aged nine and ten years, and thirteen 
and fourteen years. The correlations for the younger group are 
shown in Table I (1, Chapter XIV). 

Eells tested the significance of the difference between various 
pairs of uncorrected coefficients shown in Table I, reporting only 
one difference to be significant at the .01 level, namely, that 
involving the Henmon-Nelson and the Otis Alpha Nonverbal tests. 
He observed that the variations in the sizes of the correlations 
might be caused in part by differences in the reliabilities of the 
tests. He reported the reliability coefficients provided by the 
authors of the various tests and commented on the lack of 
comparability of these coefficients because of differences in 
methods of computing reliability and in ranges of talent 
investigated by the test makers. The reliability coefficients are 
shown in column 3 of Table I. 


¹ The definitions and measures of class and status advocated by Warner 
and others are accepted operationally only. The writer believes that these 
are not satisfactory, either conceptually or statistically, to many 
sociologists, psychologists and educators. This does not mean that the 
writer is unsympathetic to or unaware of problems of individual differences 
connected with ‘class membership.’ Rather, he believes that there is need for 
careful definition and measurement in an area which has such important 
implications for educational theory and practice. 


TABLE I.—CORRELATIONS AND TEST RELIABILITIES FOR 
YOUNGER CHILDREN 


Status and IQ from       Reported        Reported         Corrected 
                         Correlations    Reliabilities    Correlations 

Henmon-Nelson            .35             .89              .37 
Otis Alpha Verbal        .34             .71              .40 
Kuhlmann-Anderson        .33             -                - 
Otis Alpha Nonverbal     .28             .68              .34 


Eells pointed out that variations in the size of the reliability 
coefficients ‘paralleled’ variations in the size of the ISC-IQ cor- 
relations. The meaning of ‘paralleled’ is to be inferred from a con- 
sideration of the figures in columns 2 and 3 of Table I. However, 
his analysis would have been more complete if the correlations 
had been corrected for attenuation. The corrected coefficients are 
shown in column 4.² The difference between the correlations involving 
the Henmon-Nelson and the Otis Alpha Nonverbal Test 
is reduced, leading to some doubt about the existence of real differ- 
ences between correlations of verbal and nonverbal IQ’s with ISC. 
If status differences do arise from some bias in the test, it would 
appear that the bias is not exclusively that of verbal symbolism. 
High-status children perform better on both verbal and nonverbal 
tests; superior performance is more than a matter of symbolism. 
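
The corrected coefficients of column 4 can be reproduced by the usual correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The Python sketch below is an illustration under two assumptions: that this standard formula is the one applied, and that the reliability of the status measure is taken as 1, no correction having been made for it. The function name `correct_for_attenuation` is hypothetical.

```python
from math import sqrt

def correct_for_attenuation(r_xy, reliability_y, reliability_x=1.0):
    """Estimate the correlation that would hold between perfectly reliable measures."""
    return r_xy / sqrt(reliability_x * reliability_y)

# (reported correlation with ISC, reported test reliability) from Table I
TABLE_I = {
    "Henmon-Nelson": (0.35, 0.89),
    "Otis Alpha Verbal": (0.34, 0.71),
    "Otis Alpha Nonverbal": (0.28, 0.68),
}

for test, (r, reliability) in TABLE_I.items():
    print(f"{test}: {correct_for_attenuation(r, reliability):.2f}")
```

The values obtained, .37, .40, and .34, agree with those tabled; the Kuhlmann-Anderson test is omitted because no reliability is listed for it.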





² The reliability of ISC is reported by Warner to be very high. No 
correction was applied for unreliability in measures of status. 
THE LINEARITY OF REGRESSION


When the measures of status were plotted against IQ’s obtained
from the various tests listed in the preceding section, Eells noticed
that the regression lines were approximately linear for only part of
the ISC scale. All curves tended to flatten out beyond a status
score of 13, the lower point of what was considered to be the high-status
range. He proposed several hypotheses in an attempt to account
for this apparent non-linearity in the data (1, pp. 149-151).

Fig. 1. Mean IQ’s for nine- and ten-year-old pupils of varying social
status. (Adapted from Eells et al., p. 146.)

Two further hypotheses occurred to the writer. In the first place 
the levelling off might be an artifact of the measuring instruments. 
If tests are not of a suitable level of difficulty for the subjects, 
irregularities may occur at those points in the scale where the tests 
are too easy or too difficult. Thus, if they are too easy they will 
not differentiate levels of ability at the upper end of the IQ scale. 
If they are too hard, they may also fail to be discriminatory. Un- 
suitable tests may provide data which do not have a linear relation- 
ship with measures of status. 

Eells commented on the differences in difficulty of the various 
tests and on possible differences in their ceilings. For instance,
several hundred children had IQ’s over 130 on the Henmon-Nelson 
test, whereas the maximum possible IQ for a ten-year-old on the 
Otis Alpha Verbal test was 128, and for a nine-year-old, 136. Eells 
did not take these factors into account in his discussion of the 
levelling off found at the high-status levels. He also failed to con- 
sider the implications of the differences in mean IQ’s and standard 
deviations of the various tests, as will be indicated later. 

If a large number of subjects ‘break the test,’ it should not be 
surprising to find that there is a change in the direction of the line 
of regression at the point where they can no longer be measured 
because of insufficient ‘top to the test.’ The Henmon-Nelson test 
was designed for Grades III to VIII; the Otis Alpha for Grades I 
to IV; and the Kuhlmann-Anderson for Grades III to VI. The 
mean grade placement for the younger children in this study was 
4.2. One-half of the total possible score on the Otis Alpha Verbal 
test corresponds to a mental age of six and one-half years; on the 
Nonverbal test, to a mental age of eight and one-half years. The 
mean chronological age of these subjects was about ten years. It 
is hypothesized that the lack of linearity in the data at the high 
levels might arise from the level of difficulty of the tests; i.e.,
non-linearity may be an artifact of the tests.

There is another possibility that should be considered. The IQ
units for any given test may not be equal at all points of the scale. It
may require more ability to raise the IQ on the Henmon-Nelson
test from 115 to 120 than from 95 to 100. If this should be so, then
the curve showing the relationship between status and IQ, when
each is plotted as if the units were equal, cannot be expected to be
linear throughout the whole range. It is generally admitted that
IQ units for any given test are not necessarily equal. This possibility
should, then, be considered in any attempt to account for
the nature of the curves shown in Figure 1.

3 A test is considered most appropriate for a group when the mean score
of the group is about one-half of the maximum obtainable score.

Eells recognized the possibility that the status scale was not 
linear, but after some consideration discarded it as not accounting 
for the facts. However, until it is known that the units for the de- 
pendent and for the independent variables are equal, there is some 
question about the desirability of elaborating hypotheses to ac- 
count for a situation which may not actually exist. 


TABLE II.—MEANS AND SIGMAS ON VARIOUS INTELLIGENCE TESTS

Test                       Mean      SD
Henmon-Nelson             107.2     17.2
Kuhlmann-Anderson         102.9     11.3
Otis Alpha Verbal         101.3     10.8
Otis Alpha Nonverbal       99.9     10.8





THE REGRESSION LINES INVOLVING VERBAL AND NONVERBAL IQ’S 


The scattergram showing the relationship between status and 
IQ for different intelligence tests revealed that the mean IQ’s of 
different status groups varied markedly from test to test. Eells 
noted that the means at the low-status levels were quite similar 
for both verbal and nonverbal tests. However, change in mean IQ 
with increase in status did not occur at the same ‘rate’ for all tests. 
He observed that the ‘rate of increase’ was greater for the Henmon- 
Nelson than it was for the Otis Nonverbal test. Such an observation 
is consistent with the third hypothesis suggested in the opening 
paragraph of this paper. 

However, this differential ‘rate of increase’ of IQ with increase 
in status may be more apparent than real: The IQ units on different 
tests may not be equal. The mean and sigma values of the nine- 
and ten-year-olds in the study are in Table II; the IQ units are 
not equal from test to test. 
It may be seen from Table II that IQ’s representing a given level
of ability in this group may vary markedly from test to test. The 
values of IQ’s, for the two tests to which Eells devotes major at- 
tention, at various sigma distances from the respective means are 
shown in Table III. 

It is quite apparent from Table III that a given IQ does not
represent the same relative ability on these two tests. A child 
with an IQ of about 111 on the Otis Alpha Nonverbal test stands 
at about the 84th percentile in this group of over 2,000 children. 
On the Henmon-Nelson test a child must have an IQ of about 124 
to occupy the same relative position. 


TABLE III.—IQ’S AT COMPARABLE SIGMA POSITIONS

Sigma Units from the Mean    Henmon-Nelson    Otis Nonverbal    Difference
M - 3SD                           55.6             67.5            -11.9
M - 2SD                           72.8             78.3             -5.5
M - 1SD                           90.0             89.1               .9
M                                107.2             99.9              7.3
M + 1SD                          124.4            110.7             13.7
M + 2SD                          141.6            121.5             20.1
M + 3SD                          158.8            132.3             26.5
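The entries of Table III follow directly from the means and sigmas of Table II. The short sketch below recomputes them; the names TABLE_II and iq_at_sigma are introduced here for illustration only and are not part of the original analysis.

```python
# Means and standard deviations as reported in Table II.
TABLE_II = {
    "Henmon-Nelson": (107.2, 17.2),
    "Otis Alpha Nonverbal": (99.9, 10.8),
}

def iq_at_sigma(mean, sd, z):
    """IQ lying z standard deviations from the group mean."""
    return mean + z * sd

rows = []
for z in range(-3, 4):
    hn = iq_at_sigma(*TABLE_II["Henmon-Nelson"], z)
    otis = iq_at_sigma(*TABLE_II["Otis Alpha Nonverbal"], z)
    # (sigma position, Henmon-Nelson IQ, Otis Nonverbal IQ, difference)
    rows.append((z, round(hn, 1), round(otis, 1), round(hn - otis, 1)))

for row in rows:
    print(row)
```

The difference grows steadily with distance above the mean, because one sigma on the Henmon-Nelson covers 17.2 IQ points against 10.8 on the Otis Alpha Nonverbal.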





The mean scores on the Henmon-Nelson and the Otis Alpha 
Nonverbal tests for each of the status groups represented in Eells’ 
Figure 9 (p. 146) were read from the graph and changed to standard 
score form by using the appropriate means and sigmas. The scores 
thus obtained were then plotted with the result shown in Figure 2. 

It now appears that the differences between the mean scores 
expressed in comparable units are quite uniform along the social-status
scale. It is still apparent that “high-status children receive
increasingly higher IQ’s on all tests,” but not that they do so “at
a greater rate of increase for the Henmon-Nelson test.” When the
IQ’s are changed to standard scores, the differences between the
regression lines for verbal and nonverbal intelligence are much less
marked than when IQ units are used for the analysis. This type
of unit also indicates why it is that Eells was able to remark that 
at the low-status levels all tests yielded similar IQ’s. It is a statisti- 
cal artifact that the mean IQ’s from these tests at the lowest status 
level shown on Figure 1 have very similar numerical values. These
means are about 90 and happen to correspond to approximately
the 16th percentile on both the Henmon-Nelson and the Otis
Alpha Nonverbal tests.








[Figure 2: mean standard scores (vertical axis, roughly 35 to 60) plotted
against ISC status intervals (25-27, 22-24, 19-21, 16-18, 13-15), with one
curve for the Otis Alpha test and one for the Henmon-Nelson test.]

Fig. 2. Mean IQ’s for nine- and ten-year-old pupils of varying social
status expressed in standard score form.

Note: $\text{Standard Score} = 50 + \left(\frac{X - M}{SD}\right) \times 10$,
where X is the mean score for each status interval shown in Figure 1;
M is the mean of the whole group, and SD is the standard deviation, as
shown in Table II.
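The conversion given in the note to Figure 2 may be illustrated as follows. This is a sketch only: the function name is an assumption, and the input of 90 is the approximate mean IQ at the lowest status level mentioned in the text, not a value read from the original graph.

```python
def standard_score(x, mean, sd):
    """Standard score with mean 50 and sigma 10, as in the note to Figure 2."""
    return 50 + ((x - mean) / sd) * 10

# Group means and sigmas from Table II.
henmon_nelson = standard_score(90, 107.2, 17.2)
otis_nonverbal = standard_score(90, 99.9, 10.8)

print(round(henmon_nelson, 1), round(otis_nonverbal, 1))
```

An IQ of 90 falls close to one sigma below the mean on both tests, which is why the two curves nearly coincide at the lowest status level even though their IQ units differ.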
CONCLUSION


Several features of the correlational analysis reported by Eells 
in Intelligence and Cultural Differences have been discussed in con- 
nection with the hypothesis that status differences in IQ’s are a 
function of the nature of the tests, and that these differences are 
produced especially in connection with the verbal factor. A con- 
sideration of such factors as the reliability of the tests, the equality 
of IQ units on a given test and from test to test, and the difficulty 
of the tests, suggested that there is need to reconsider Eells’ conclusion
that “There is some definite evidence ... that the chief
reason for the status differences in IQ’s may be the different oppor- 
tunities which pupils from high- and low-status levels have for 
familiarity with the kinds of cultural materials and processes repre- 
sented by the usual tests” (1, p. 151). Of course, the word ‘may’ in 
this quotation must not be ignored; but the materials reported in 
this paper indicate that the data give little positive support to 
even this tenuous conclusion. 

It is here suggested that this type of analysis cannot provide 
satisfactory evidence on the factors related to status differences in 
IQ’s, and that an experimental approach is necessary. 
AN EXPERIMENT IN EVALUATION IN
BIOLOGICAL SCIENCE¹


JOHN M. MASON 


Michigan State College 
and 


GEORGE W. ANGELL 
State University Teachers College 
New Paltz, New York 


The main purposes of this study were: (1) to compare the relative 
effectiveness of two different evaluation programs with respect to 
student achievement in biological science at the end of the first 
term in the course; and (2) to discover if the two evaluation procedures
produced any significant changes in student behavior, such as
study habits and reactions to the course.

Design of the study.—The subjects were one hundred four stu- 
dents enrolled in the first term of biological science at Michigan 
State College in the spring term, 1949. These students constituted 
five laboratory sections and they attended the same lectures. 
Laboratory sections met once each week for a two-hour period 
and two one-hour lectures were given each week. All the students 
in this study were taught in both lecture and laboratory by the 
same instructor. 

The teaching variable was the method used for evaluating stu- 
dent achievement during the term. This variable was established 
and maintained by the design of the study whereby the students in 
specified laboratory sections had one evaluating program and the 
students in the other laboratory sections of the correlated lecture- 
laboratory arrangement had a different evaluation program. 

In initiating the study, the instructor presented the tentative
plans of the study to the students, and the students in each laboratory
section then made the final decision as to the evaluation
program that would be used in that particular laboratory. The
students in two of the laboratory sections selected one of the two
evaluation programs that were deemed feasible for the study and
these students were designated the experimental group. The students
in the remaining three laboratory sections selected the other
evaluation program and they were designated the control group.
This preliminary planning, discussion, and selection of groups took
place during the first three weeks of the course. The experiment
proper began with the fourth week and extended through the
ninth week of a ten-week term.

1 Contribution No. 51 of the Department of Biological Science, The Basic
College, Michigan State College, East Lansing, Michigan.

One evaluation program consisted of the administration of a 
weekly objective examination. Each examination was based, as a 
rule, upon the subject matter that had been covered the preceding 
week either in laboratory, in lecture, or in an assignment. Each 
examination was administered during the weekly laboratory period 
and required fifteen to thirty minutes of working time. The stu- 
dents who had selected this evaluation program, termed the re- 
quired weekly test program, constituted the control group. There 
were originally seventy-four students in this group. However, the 
records of four of these students were incomplete and therefore 
were not included in this study. Thus, the control group was com- 
posed of seventy students. 

The other evaluation program was arbitrarily called the self- 
evaluation program and the students using this program consti- 
tuted the experimental group. In this program, the students were 
not required to take any of the weekly tests during the time of the 
experiment. However, the same weekly tests as were given to the 
control group were made available to the students in the experi- 
mental group. Students in the experimental group could take these 
tests either during the laboratory period or at their own conveni- 
ence. A room was provided where they could secure the tests and 
take the tests at any time during the school day. Keys for the
tests were available and the mean scores of the control group on 
the various tests were also made available so that a student might 
compare his own achievement with that of the average of the 
control group. 

The experimental group originally had fifty-two students, but 
for various reasons data were complete for only thirty-four stu- 
dents. Eight of the original experimental students accelerated 
during the term and took the Comprehensive Examination rather 
than the course final examination. This may indicate that the
experimental group originally had a higher proportion of excellent 
students than did the control group. On the other hand, it may 
also indicate that the final average of the experimental group had 
been depleted by the loss of some of its best students. 

It is to be pointed out that a student’s course grade in the first 
term’s work in biological science at Michigan State College was a 
composite grade based upon an instructor’s grade for the student 
plus the student’s grade on a departmental term-end examination. 
The instructor’s grade constituted forty-nine per cent of the total 
grade and the student’s grade on the term-end examination con- 
stituted fifty-one per cent of the total course grade. In this study, 
the instructor’s grade for the students in the control group was 
determined by the student’s scores on the weekly tests. The in- 
structor’s grade for the students in the experimental group was 
the student’s grade on an instructor-made examination which was 
given to the students in the experimental group the last week of 
the term. Thus, the main difference between the two evaluation 
programs was that the students in the control group were required 
to take a weekly test while the students in the experimental group 
were not required to take weekly tests but had the test available 
for self-testing and self-scoring. These two programs were com- 
pared with respect to their effectiveness as shown by student 
achievement on a departmental term-end examination. 

Hypothesis tested, collection and treatment of data, and results.— 
The hypothesis held for the comparison of the two groups was that 
achievement on the departmental term-end examination is inde- 
pendent of the evaluation method used during the term; that is, 
the mean achievements of the two groups are equal. The technique
illustrated by Johnson² for analysis of variance and covariance
with one independent variable was used to test the above hypothe- 
sis. The test of significance used was the F-test. 
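The form of such an analysis can be sketched as follows. This is not Johnson's worked procedure but the usual textbook computation for a one-way analysis of covariance with a single covariate, and the scores shown are hypothetical, since the original student records are not reported. The function name ancova_f is likewise an assumption made for illustration.

```python
def ancova_f(groups):
    """F-test for group differences on y after adjusting for covariate x.

    groups: list of groups, each a list of (x, y) pairs.
    Returns (F, df_between, df_within).
    """
    def sums(pairs):
        # Sums of squares and cross-products about the means of `pairs`.
        n = len(pairs)
        mx = sum(x for x, _ in pairs) / n
        my = sum(y for _, y in pairs) / n
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        return sxx, sxy, syy

    everyone = [pair for group in groups for pair in group]
    txx, txy, tyy = sums(everyone)          # total
    wxx = wxy = wyy = 0.0                   # pooled within groups
    for group in groups:
        sxx, sxy, syy = sums(group)
        wxx, wxy, wyy = wxx + sxx, wxy + sxy, wyy + syy

    adj_total = tyy - txy ** 2 / txx        # adjusted for regression on x
    adj_within = wyy - wxy ** 2 / wxx
    adj_between = adj_total - adj_within

    df_between = len(groups) - 1
    df_within = len(everyone) - len(groups) - 1  # one df lost to the covariate
    f_ratio = (adj_between / df_between) / (adj_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical (covariate, criterion) scores for two small groups.
group_a = [(1, 1), (2, 2), (3, 3), (4, 5)]
group_b = [(1, 11), (2, 12), (3, 13), (4, 15)]
f_ratio, df_b, df_w = ancova_f([group_a, group_b])
print(round(f_ratio, 2), df_b, df_w)
```

With two groups totalling 104 students, as in this study, the same scheme would give 1 and 101 degrees of freedom for the F-ratio.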

The independent variable held constant in the analysis was the
student’s score on an unannounced instructor-made test which
was administered during the sixth week of the term to all the
students. This test consisted of one hundred true-false items. Sixty
of these items were based on the subject matter covered during the
first five weeks of the term and the remaining forty questions
were based on material to be studied during the last four weeks in
the course. The curricular validity of this test was assumed from the
fact that the items were taken directly from the syllabus used by
the students. The estimated reliability coefficient of this test was
.81 as determined by the Kuder-Richardson formula (approximation).³
The respective means and sigmas for the experimental and
control groups on this test were: means 61 and 59.54; and sigmas
10.14 and 11.40.

2 Palmer O. Johnson, Statistical Methods in Research. New York:
Prentice-Hall, Inc., 1949, pp. 246-255.
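A reliability estimate of this kind can be illustrated from the reported means and sigmas. The sketch below assumes that the "approximation" is the Kuder-Richardson formula 21, which requires only the number of items, the mean, and the standard deviation; the source does not state which formula was used, and the function name kr21 is introduced here for illustration.

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson formula 21 for a test scored one point per item."""
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * variance)
    )

# One-hundred-item test; means and sigmas as reported for each group.
experimental = kr21(100, 61.0, 10.14)
control = kr21(100, 59.54, 11.40)

print(round(experimental, 2), round(control, 2))
```

Under this assumption the two group values come to roughly .78 and .82, which bracket the reported coefficient; the reported figure was presumably computed on the combined group.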

The departmental term-end examination was assumed to possess 
curricular validity in that it was constructed by a departmental 
committee and had been reviewed and approved by the biological 
science staff. The examination contained seventy-two items and 
the estimated reliability coefficient was .74. The respective mean 
scores and sigmas for the experimental and control groups were: 
means 43.18 and 44.00; and sigmas 7.24 and 7.58. 

The results of the analysis of the data showed that the F-test 
was not significant. Therefore, it was inferred that the difference 
between the mean achievements of the students in the two groups 
on the departmental term-end examination could have been due to 
chance. In other words, it may be inferred that one method of 
evaluation during the term had been just as effective as the other 
method as far as student achievement on the departmental term- 
end examination was concerned. 

Educational implications.—It is recognized that the preceding 
inference needs additional confirmation in other situations; however,
this finding together with other findings points to several
important educational implications. Some of these are: 

1) It may be possible to set up self-evaluation procedures which 
are just as effective motivating influences for study and/or teaching 
aids as evaluation programs which require students to take weekly 
or frequent examinations. 

2) Self-evaluation needs to be sought through instruction as
does any other desirable educational objective. In this study, data
were collected with respect to the use made of the weekly tests by
the students in the experimental group. The percentages of students in
this group that made use of these weekly tests for the six weeks of
the experiment, beginning with the first test, were 89.3, 80.8, 59.5,
46.8, 40.4 and 21.2 respectively for each weekly test. These percentages
conform fairly closely to the instructor’s opinion as to his
own efforts to motivate the students into making their own evaluations.
It seems justifiable to state that the majority of college
students will not assume self-responsibility for evaluation to the
degree generally sought by college faculties without proper stimulation
and guidance.

3 Henry E. Garrett, Statistics in Psychology and Education. New York:
Longmans, Green and Co., 1947, p. 385.

3) It is possible to develop self-evaluation programs whereby 
class time, which under the teacher testing program was required 
for testing, could be used for other activities. 

4) A self-evaluation program, once in operation, could free 
teachers from many routine tasks thereby eliminating some of the 
time consuming operations that in themselves add nothing to 
teacher efficiency. 

Reported changes in behavior.—In order to discover if the two 
evaluation procedures produced any significant changes in student 
behavior, an unsigned questionnaire was used to secure the data 
with respect to this purpose of the study. This questionnaire was 
administered to all students in the study during the last week of 
the term. The questionnaire was as follows: 


BIOLOGICAL SCIENCE DEPARTMENT OF THE BASIC COLLEGE 
MICHIGAN STATE COLLEGE 


Basic 121 
Unsigned Student Questionnaire 


Instructions: 


If you desire to answer no to the statement, mark space 1. 
If you desire to answer yes to the statement, mark space 2. 
If you feel that the statement does not apply to your section, mark 
space 3. 
If you do not feel that you have sufficient information to give a justi- 
fiable answer to the question, mark space 4. 
1) Did you have a planned study schedule which includes a time for the 
studying of biological science? 
2) Did you keep regular hours for the studying of biological science? 
3) Did you keep your preparations for biological science up to date by 
studying this subject at least four times a week? 
4) Did you spend as much time studying for biological science each week 
as you did for other courses? 
5) Do tests, in general, cause you to worry? 
6) Do you feel that the taking of biological science weekly tests contributed
to your learning of the subject matter?

7) Did you look up the answers to questions that you missed on the weekly 
quizzes? 

8) Do you think the weekly examinations measured fairly well your 
knowledge and understanding of biological science? 

9) Did the weekly tests make you dislike the course more than you would 
have without tests? 

10) Do you feel, considering the effort that you have put in this course, 
that you have learned as much in this course as in other courses? 

11) Do you feel that your work this term in biological science gave you
more opportunity to find your individual study problems and to plan 
accordingly than do other courses? 

12) Do you feel that biological science this term has provided you with an 
opportunity to assume more responsibility for your educational growth 
and development than do other courses? 

13) Do you feel that your efforts in biological science have been about all 
that they could have been considering the other things that you have 
had to do this term? 

14) Do you feel that if you had had more tests this term that you would 
have studied more on biological science? 

15) Do you feel that if you had had fewer tests this term that you would 
have studied just about the same amount regardless of the number of 
tests? 

16) Did the weekly quizzes cause you to cram within twenty-four hours 
preceding each test? 

17) Do you feel that teacher evaluation is more important than self-evalu- 
ation? 

18) Do you feel, considering the many factors that enter into such a feeling, 
that, as a general statement, you have enjoyed this course as much as 
other courses this term? 

19) Did you enjoy studying biological science as much as you expected to 
at the beginning of the term? 

20) Did you look upon the weekly quizzes as opportunities to determine 
your strengths and weaknesses? 

21) Did the weekly quizzes help you plan subsequent study? 

22) Do you feel that you participated in a ‘democratic’ teaching-learning 
situation this term in biological science? 

23) If you had your choice of determining a testing program for the next 
term of biological science, would you select the method used this term? 


Table I gives the percentages of ‘yes’ and ‘no’ responses to the 
items in the questionnaire. Critical ratios for the ‘yes’ responses 
are also given in Table I. 

TABLE I.—PER CENT OF RESPONSES MADE BY STUDENTS IN CONTROL
AND EXPERIMENTAL GROUPS TO 23 ITEMS ON THE UNSIGNED
QUESTIONNAIRE

                    Per Cent ‘Yes’                          Per Cent ‘No’
Item   Exp.†   Control‡   Difference   σDp§    D/σDp    Exp.   Control   Difference
  1    29.7     36.7         7.0       8.8      .80     70.2    63.2        7.0
  2    25.5     32.3         6.8       8.5      .80     74.4    67.6        6.8
  3    27.6     36.7         9.1       8.7     1.05     72.3    60.2       12.1
  4    40.4     57.3        16.9       9.3     1.82     57.4    42.6       14.8
  5    48.9     41.1         7.8       9.4      .83     46.8    58.8       12.0
  6    15.2     83.5        68.3       7.5     9.11*     4.3    13.4        9.1
  7    32.6     52.9        20.3       9.1     2.23*    10.8    44.1       33.3
  8    13.3     67.6        54.3       7.5     7.24*     4.4    29.4       25.0
  9     2.5     26.4        23.9       5.8     4.12*    20.0    73.5       53.5
 10    87.2     75.0        12.2       7.1     1.72     12.7    23.5       10.8
 11    48.9     26.4        22.5       9.0     2.50*    40.4    44.1        3.7
 12    57.4     39.7        17.7       9.3     1.90     36.1    48.5       12.4
 13    48.9     41.1         7.8       9.4      .83     51.0    57.3        6.3
 14    68.0     19.4        48.6       8.3     5.85*    27.6    64.1       36.5
 15    15.5     50.0        34.5       8.0     4.31*    35.5    50.0       14.5
 16     4.6     51.4        46.8       9.4     4.98*     4.6    48.5       43.9
 17    42.5     33.8         8.7       9.2      .95     51.0    52.9        1.9
 18    78.7     72.0         6.7       8.1      .83     21.2    26.4        5.2
 19    74.4     67.6         6.8       8.5      .80     25.5    27.9        2.4
 20    27.6     72.0        44.4       8.4     5.29*     6.3    25.0       18.7
 21    11.1     64.7        53.6       7.3     7.34*     6.6    32.3       25.7
 22    84.7     69.1        15.6       7.6     2.05*    15.2    20.5        5.3
 23    62.7     72.0         9.3       8.9     1.04     37.2    23.5       13.7

* Significant (Ratio 1.96 or greater).

† Calculations based on the responses of 47 students in original experimental
group.

‡ Calculations based on the responses of 68 students in original control
group.

§ Calculations made from formula given by Henry E. Garrett, Statistics
in Psychology and Education. New York: Longmans, Green and Co. 1944,
page 228.

Items one through four in the questionnaire relate to some study
habits. Inspection of Table I shows that a larger per cent of the
students in the control group responded ‘yes’ to these items than
in the experimental group. However, the percentage differences
between the two groups on these items were not statistically significant.
From this finding, one may infer that the requiring of
weekly tests did not necessarily produce significant changes in
student behavior with respect to these study habits. It is interesting
to note, however, that 51.4 per cent of the students in the
control group indicated that the weekly tests caused them to 
cram within twenty-four hours preceding each test (item 16). The 
difference between ‘yes’ responses for this item for the two groups 
was highly significant and the finding indicates that tests can be a 
stimulus for this kind of study. 
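The critical ratios of Table I can be illustrated with a short computation. The sketch assumes the common formula for the standard error of a difference between two independent percentages, using each group's own percentage; whether this is exactly the formula on the cited page of Garrett is not confirmed by the source. The function names are illustrative; the group sizes of 47 and 68 are those given in the notes to Table I.

```python
import math

def se_diff_percent(p1, n1, p2, n2):
    """Standard error of the difference between two independent percentages."""
    return math.sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)

def critical_ratio(p1, n1, p2, n2):
    """Absolute difference between percentages divided by its standard error."""
    return abs(p1 - p2) / se_diff_percent(p1, n1, p2, n2)

N_EXP, N_CONTROL = 47, 68

# Item 4: per cent answering 'yes' in the experimental and control groups.
se_item4 = se_diff_percent(40.4, N_EXP, 57.3, N_CONTROL)
cr_item4 = critical_ratio(40.4, N_EXP, 57.3, N_CONTROL)

print(round(se_item4, 2), round(cr_item4, 2))
```

For item 4 this gives values close to the tabled 9.3 and 1.82, short of the 1.96 taken as significant. Agreement with the tabled figures for every item should not be assumed, since the number of students answering may have differed from item to item.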

The fact is brought out in Table I, item 5, that more than forty 
per cent of the students in both groups indicated that tests, in 
general, cause them to worry. 

Since items 6, 7, 8, and 9 were marked as not applying to their 
section by from fifty to seventy-five per cent of the students in 
the experimental group, no comparative interpretation of these
items is offered. However, inspection of the responses of the stu- 
dents in the control group to these items seems to indicate that 
the students in the control group looked favorably upon the weekly 
test program. Eighty-three per cent of the students in the control 
group thought that the weekly tests had contributed to their 
learning of the subject matter and almost sixty-eight per cent 
thought that the tests had measured fairly well their knowledge 
and understanding of the course. Twenty-six per cent of the stu- 
dents in the control group, however, indicated that the taking of 
weekly tests caused them to dislike the course more than they 
would if they had not had the weekly tests. 

The responses to items 10 through 14 give some indication of 
student reaction to the course. It is to be noted that the positive 
responses of the students in the experimental group exceeded those 
of the students in the control group in all of these items. The 
factors operating in the experimental group apparently caused a 
significant number of the students to feel that the procedure used 
in this group provided them with more opportunity to find their
individual study problems and to plan accordingly than did the 
procedures in other courses. It is also interesting to note that the 
students in the experimental group felt that if they had had required
tests they would have studied more. Items 17, 18, and
19 were also answered by a larger positive per cent by the students 
in the experimental group than by the students in the control group, 
but the differences were not significant. The results also show that 
a larger per cent of the experimental students felt that they would 
not select the same program again (item 23). In general the data 
seem to indicate that the experiment was looked upon favorably 
by a majority of the students and that the students in both groups
felt that they had learned as much in this course as in other
courses; that they had enjoyed biological science as much as other
courses; that they had participated in a ‘democratic’ teaching-learning
situation; and that, if they had a choice in determining the
testing program for the next term of biological science, they
would select the method used during the term.


SUMMARY 


The more important findings and educational implications of this 
study may be summarized as follows: 

1) Students, as a group, who were required to take weekly tests 
during the course did not score significantly higher on a depart- 
mental term-end examination than those students who were not 
required to take weekly tests but had the tests available for self- 
testing and self-scoring. 

2) In terms of scores on final examinations, self-evaluation pro- 
cedures can be as effective as forced evaluation procedures. 

3) The two different evaluating procedures used in this study 
did not produce any significant changes in the study habits of the 
students as indicated by their responses to certain items on an 
unsigned questionnaire. 

4) Students in the experimental group indicated that they would 
have studied more had they been required to take more tests. 

5) More than forty per cent of the students in both groups
indicated that tests, in general, caused them to worry.

6) Approximately eighty-three per cent of the students in the 
control group indicated that the taking of weekly tests contributed 
to their learning of biological science. However, approximately 
fifty-one per cent of this group indicated that they crammed within 
a twenty-four-hour period preceding each weekly test. 

7) There was very little difference among students’ reactions to 
the type of testing program that was followed in their particular 
group. A majority of the students in both groups reacted favorably 
to the experiment. 














QUALIFICATION RESPONSES USED WITH 
PAIRED STATEMENTS TO MEASURE 
ATTITUDES TOWARD EDUCATION 


CHARLES O. NEIDT and LYLE D. EDMISON 


University of Nebraska 


One of the major considerations in attitude measurement is that
of demonstrating the appropriateness of the form of responding 
to items of a paper-and-pencil attitude scale. Inspection of pub- 
lished attitude scales reveals that forms of item responses are many 
and varied. In some cases, however, little evidence is presented to 
support the selection of the response form which was chosen by 
the test constructor for a particular measuring device. 

Perhaps the factor restraining constructors of attitude scales 
from presenting such evidence has been the lack of independent 
criterion behavior with which to evaluate the effectiveness of 
various types of item responses. In most instances, such criterion 
behavior has been difficult if not impossible to identify. It seems 
important, therefore, that when an attitude scale is being con- 
structed for which criterion behavior is available, several response 
forms for the scale should be investigated. This procedure yields 
evidence not only concerning the relative effectiveness of the dif- 
ferent item response forms for the particular attitude being meas- 
ured, but also evidence regarding their general applicability to 
other similar measurement situations in which no criterion behavior 
is available. 

It was the purpose of this study to determine the effectiveness 
of using a qualification response combined with a paired statement 
presentation of items on an attitudes toward education scale. It 
was felt that such a study would be of value for determining the 
effectiveness of this technique as well as that of attitude measure- 
ment for predicting academic success. 


PREVIOUS RESEARCH 


In a previous study involving the construction of an attitudes 
toward education scale, Dodds (1) composed two hundred seventy 
statements describing concomitants of the educational process 
toward which students could reflect agreement or disagreement. 
When these items were presented to three hundred eighty students, 
she concluded that one hundred eighty items differentiated among 
students on the basis of total score. Neidt and Merrill (5) presented 
ninety of the most effective of Dodds’ items in two different item 
response forms to two hundred one students. These students re- 
acted to the ninety items in single statement form by responding 
on a five-point scale of the Likert type (3). The same students also 
responded to the items in paired-statement form by choosing the 
statement which most nearly represented their attitude. Thus, two 
attitude scores were obtained for each student. The score on each 
form of presentation was correlated with achievement, and the 
predictive effectiveness of each form was ascertained. It was con- 
cluded that only slight differences in effectiveness were demonstra- 
ble, but that the paired-statement form was the more feasible of 
the two forms since administration time for it was considerably less. 

The ninety statements comprising the forty-five-item paired- 
statement form of the attitude scale used in this study were the 
same items as those used by Neidt and Merrill, except for changes 
in content necessitated by substituting institutional names. 


THE QUALIFICATION RESPONSE 


When subjects are forced to choose between two statements the 
one which most nearly represents their attitude, the objection is 
frequently voiced that neither of the paired statements presented 
is close enough to their attitude for them to accept it. Similarly, 
they often state that they would like to have chosen both state- 
ments, but were not permitted to do so. In an attempt to overcome 
such an objection, and to obtain further information about the 
reaction of subjects to each pair of statements, this study included 
a qualification response which immediately followed each pair of 
statements. The form of this qualification response was the same 
throughout the scale and was stated as follows: 

_____ Only the statement I checked represents my feelings. 
_____ Both statements represent my feelings. 
_____ Neither statement particularly represents my feelings. 








Thus, in responding to an item, the subject first checked the one 
of the paired statements which most closely represented his atti- 
tude, and then checked one of the three qualification statements. 
For example, one of the forty-five items in complete form was: 


45. _____ I try to work to the utmost of my capacity in my courses. 
    _____ I’m satisfied with passing grades even though I could 
          usually do better. 

_____ Only the statement I checked represents my feelings. 
_____ Both statements represent my feelings. 
_____ Neither statement particularly represents my feelings. 











In summary, each subject checked one of the paired statements 
and one of the three qualification statements in responding to each 
of forty-five items. 


CONSTRUCTION OF THE KEY 


In constructing the keys for the paired statement response and 
the qualification responses, it was felt desirable to adopt a pro- 
cedure which would maximize the predictive effectiveness of each 
item and would minimize the obviousness of the scored response. 
The items were presented to three hundred forty-one University of 
Nebraska students in the test form previously described. This 
group was composed of two hundred twenty-seven freshmen and 
one hundred fourteen seniors. Since the experimental group was to 
be composed of sophomores, it was felt that freshmen and seniors 
would yield appropriate data for the item analysis upon which to 
base the key. 

One of the purposes of this study was to determine the effective- 
ness of attitudes as a predictor of scholastic success. Scholastic 
success was defined in this study as the average course mark ob- 
tained by each student during the semester when the scale was 
administered. It has repeatedly been demonstrated that scholastic 
aptitude scores are associated with a significant portion of the 
variation in course marks. To be effective as a contributor to the 
prediction of course marks, scores on attitude scales should also 
account for an additional significant amount of such variation. The 
criterion chosen for the construction of the key, against which to 
compare the responses to each item, was defined as the difference 
between the actual average course mark obtained by each student, 
and the average course mark predicted for him from the regression 
of course marks on scholastic aptitude scores. This procedure tends 
to maximize the individual contribution of each independent vari- 
able when they are combined in a prediction scheme. 

The linguistic and quantitative scores of the American Council 
on Education Psychological Examination were the measures of 
scholastic aptitude used in this study. A two-variable regression 
equation was determined for the freshmen and for the seniors in 
the group. By substituting each student’s L and Q score into the 
appropriate regression equation a predicted average course mark 
was obtained. The algebraic difference between the obtained aver- 
age course mark and the predicted average course mark for each 
student was then recorded as the criterion for the construction of 
the key. 
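
The criterion described above is the residual from a two-predictor regression. The following is a minimal sketch of that computation, assuming ordinary least squares; the function name and the sample figures are hypothetical and are not taken from the study.

```python
from statistics import mean

def residual_criterion(marks, l_scores, q_scores):
    """Actual minus predicted average course mark for each student.

    The prediction comes from a least-squares regression of marks on
    the L and Q scores (two predictors plus an intercept).
    """
    my, m1, m2 = mean(marks), mean(l_scores), mean(q_scores)
    dy = [y - my for y in marks]
    d1 = [x - m1 for x in l_scores]
    d2 = [x - m2 for x in q_scores]
    # Sums of squares and cross-products of the centered variables.
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * y for a, y in zip(d1, dy))
    s2y = sum(b * y for b, y in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det   # weight for the L score
    b2 = (s11 * s2y - s12 * s1y) / det   # weight for the Q score
    # Residual = obtained mark minus predicted mark.
    return [y - b1 * a - b2 * b for y, a, b in zip(dy, d1, d2)]

# Hypothetical data for six students (illustration only).
marks = [5.1, 6.3, 4.8, 7.0, 5.9, 6.6]
l_scores = [52, 61, 47, 70, 58, 63]
q_scores = [40, 45, 38, 50, 47, 41]
criterion = residual_criterion(marks, l_scores, q_scores)
```

By construction these residuals sum to zero and are uncorrelated with both aptitude scores, so an item keyed against them can add only to what aptitude already predicts.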

The test forms of the upper twenty-seven per cent and the lower 
twenty-seven per cent of the criterion distribution were identified 
and the proportions of the upper and lower group marking each 
one of the paired statements and each one of the three qualification 
statements were recorded. With the use of Flanagan’s correlation 
table (2) the estimated correlation between each possible 
response and the criterion was obtained. Five such correlations 
were obtained for each of the forty-five items of the scale. 
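
The first step of this item analysis, finding the proportion of the upper and lower criterion groups that marked a given response, can be sketched as follows. The function name and data are hypothetical, and the lookup in Flanagan's table that converts the two proportions into an estimated correlation is not reproduced here.

```python
def upper_lower_proportions(criterion, chose_response, fraction=0.27):
    """Proportion marking a response in the upper and lower criterion groups.

    criterion      -- one criterion value per student
    chose_response -- True/False per student: did he mark this response?
    fraction       -- size of each extreme group (27 per cent in the study)
    """
    n = len(criterion)
    k = max(1, round(n * fraction))          # students in each extreme group
    order = sorted(range(n), key=lambda i: criterion[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(chose_response[i] for i in upper) / k
    p_lower = sum(chose_response[i] for i in lower) / k
    return p_upper, p_lower

# Hypothetical data for ten students (illustration only).
crit = [-1.2, 0.4, 0.9, -0.3, 1.5, -0.8, 0.1, 0.7, -0.5, 1.1]
chose = [False, True, True, False, True, False, True, True, False, False]
p_upper, p_lower = upper_lower_proportions(crit, chose)
```

With three hundred forty-one test forms, each extreme group under this rounding rule would contain ninety-two forms; the source does not state the exact group size used.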

Inspection of the estimated item-criterion correlation coefficients 
yielded by the foregoing procedure revealed that these correlations 
varied from .00 to .45. A weight of 2 was assigned the paired-state- 
ment responses correlating .20 or higher with the criterion and a 
weight of 1 was assigned to those correlating less than .20 with the 
criterion. Weights of 2, 1, and 0 were assigned to the qualification 
statements in rank order of their correlation with the criterion. 
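
The weighting rule just stated can be written out as a short sketch. The function names and labels are hypothetical, and the handling of tied correlations is an assumption (ties keep their listed order), since the source does not describe it.

```python
def paired_statement_weight(r, cutoff=0.20):
    """Weight 2 for a keyed response correlating .20 or higher, else 1."""
    return 2 if r >= cutoff else 1

def qualification_weights(correlations):
    """Weights 2, 1, 0 for the three qualification statements,
    assigned in rank order of their correlation with the criterion.

    correlations -- dict mapping a statement label to its estimated r
    """
    ranked = sorted(correlations, key=correlations.get, reverse=True)
    return {label: weight for label, weight in zip(ranked, (2, 1, 0))}

# Hypothetical correlations for one item (illustration only).
example = qualification_weights({"only": 0.10, "both": 0.05, "neither": 0.30})
```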


ADMINISTRATION OF THE SCALE 


The attitudes toward education scale was administered to one 
hundred ninety-seven students who were enrolled in a sophomore- 
level course in educational psychology. Of these students, one 
hundred forty-three were women and fifty-four were men. One 
hundred forty-four were sophomores, thirty-seven were juniors, 
and sixteen were seniors. The scale was administered during a 
regular class period. About twenty-five minutes was required for 
completing the forty-five paired statements and the qualification 
responses. 

The average course mark each student attained during the 
semester in which the scale was administered was obtained from 
the Registrar’s office at the close of the semester. The Q and L 
scores of the American Council on Education Psychological Examination 
were also obtained for each student. 


RESULTS 


The distributions of the two attitude scores yielded by this 
scale, i.e., the paired-statement scores and the qualification response 
scores, were plotted and found to approximate normality closely. 
The mean score for the qualification responses was 41.0 
with a standard deviation of 4.7, and the mean score for the paired 
statements was 32.1 with a standard deviation of 4.9. The qualification 
response scores ranged from 28 to 56, a range of 28, and the 
paired-statement scores ranged from 19 to 43, a range of 24. 

To obtain evidence regarding the reliability of the two scores 
yielded by the scale, the split-half technique was employed. To 
divide the scale into two forms each representing half the scale, 
all items were first analyzed for difficulty. The items were then 
paired on the basis of their percentage difficulty and one of each 
pair was randomly assigned to each test form. Thus two forms 
were obtained from the paired statements and two forms from the 
qualification responses. The reliability coefficients yielded from this 
procedure after application of the Spearman-Brown formula were 
found to be .68 for the paired statements and .39 for the qualifica- 
tion responses. The foregoing procedure of splitting the scale was 
followed to more nearly assure equivalence of test forms from the 
differentially weighted items. 
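
The split-half procedure described above can be illustrated with a brief sketch. The Spearman-Brown formula for a test of doubled length is standard; the function names, the random seed, and the treatment of the odd forty-fifth item (placed at random on one form) are assumptions, since the source does not say how it was handled.

```python
import random

def spearman_brown(r_half, factor=2):
    """Step a half-test correlation up to the full-length reliability."""
    return factor * r_half / (1 + (factor - 1) * r_half)

def half_test_r(r_full):
    """Inverse for doubled length: the half-test correlation implied
    by a reported full-length reliability."""
    return r_full / (2 - r_full)

def matched_split(difficulties, seed=0):
    """Pair items of adjacent difficulty and assign one of each pair
    at random to each half. Returns two lists of item indices."""
    rng = random.Random(seed)
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    form_a, form_b = [], []
    for j in range(0, len(order) - 1, 2):
        pair = [order[j], order[j + 1]]
        rng.shuffle(pair)
        form_a.append(pair[0])
        form_b.append(pair[1])
    if len(order) % 2:                     # assumption: odd item placed at random
        rng.choice([form_a, form_b]).append(order[-1])
    return form_a, form_b
```

Under this formula the reported reliabilities of .68 and .39 would correspond to half-test correlations of about .515 and .242, respectively.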

To determine the effectiveness of the scale, the two attitude scale 
scores and the two scholastic aptitude scores were correlated with 
average course marks, individually and in combination. In Table I 
are shown the zero order, multiple, and partial correlations yielded 
by this procedure. Inspection of the zero order correlation coeffi- 
cients in this table indicates that the qualification response score 
correlated nonsignificantly (r = .053) with the criterion and very 
low with the other prediction variables. The correlation coefficients 
between the L and Q scores and the criterion are significant but 
somewhat lower than have been found in other studies using com- 
parable students at the University of Nebraska. 

When the qualification response score was omitted from the 
multiple correlation between the criterion and the prediction vari- 
ables, a reduction in the size of R from .445 to .435 was found, 
which corresponds to the non-significant partial correlation of .045. 
Thus the qualification response score did not contribute signifi- 
cantly to the prediction of average course marks. 

When the paired-statement score was omitted from the three- 
variable prediction scheme, the multiple correlation was reduced 
from .435 to .364—a significant reduction as indicated by the 
partial correlation of .263 between average course marks and 
paired-statement scores with L and Q scores held constant. 
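
The relation between a drop in the multiple correlation and the corresponding partial correlation can be sketched for the simplest case of two predictors. The function names and the correlations used are hypothetical and are not the values of this study, which involved up to four predictors.

```python
from math import sqrt

def multiple_r_two(r_y1, r_y2, r_12):
    """Multiple correlation of y with two predictors, from zero-order r's."""
    r_sq = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_sq)

def first_order_partial(r_y1, r_y2, r_12):
    """Correlation of y with predictor 1, predictor 2 held constant."""
    return (r_y1 - r_y2 * r_12) / sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))

def partial_from_increment(r_full, r_reduced):
    """Magnitude of the partial correlation for the omitted variable:
    the gain in R squared as a share of the variance left unexplained."""
    return sqrt((r_full ** 2 - r_reduced ** 2) / (1 - r_reduced ** 2))

# Hypothetical zero-order correlations (illustration only).
r_full = multiple_r_two(0.30, 0.40, 0.20)
via_increment = partial_from_increment(r_full, 0.40)
direct = first_order_partial(0.30, 0.40, 0.20)
```

In this two-predictor case the two routes agree, which is why a small reduction in R on omitting a variable goes with a small partial correlation for that variable.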


TABLE I.—CORRELATION COEFFICIENTS FOR COURSE MARKS, ATTITUDE 
SCORES AND SCHOLASTIC APTITUDE SCORES 

                       Qual.           Paired          L Score   Q Score 
                       response (X1)   state. (X2)     (X3)      (X4) 
Course marks (Y)       .053            .253            .356      .202 
Qual. response (X1)                    .052            .066      .063 
Paired state. (X2)                                     .036      .031 
L score (X3)                                                     .479 

$R_{Y(X_1X_2X_3X_4)} = .445$;  $R_{Y(X_2X_3X_4)} = .435$;  $R_{Y(X_3X_4)} = .364$ 
$r_{YX_1 \cdot X_2X_3X_4} = .045$;  $r_{YX_2 \cdot X_3X_4} = .263$ 


IMPLICATIONS 


The results of this study provide evidence for two inferences. 
First, although the use of the qualification response combined with 
the paired statement response overcame the objections frequently 
voiced by subjects when they are forced to choose between paired 
statements, its quantitative score was not effective under the 
circumstances of this study. Perhaps other testing circumstances 
and scoring procedures will yield more favorable evidence. 

Second, the significant contribution of the paired-statement score 
to the prediction of academic success emphasizes the possibilities 
of combining attitude measures with other prediction variables 
for obtaining more accurate prediction. Although the forty-five 
items used in this scale had been rather carefully analyzed and 
pretested before this study was undertaken, they undoubtedly can 
be improved and additional effective items can be constructed. 
Evidence from this study supports the position that attitude measurement 
can significantly improve the prediction of academic success. 


SUMMARY 


An attitudes toward education scale composed of forty-five 
paired statements each followed by a three-category qualification 
response was administered to one hundred ninety-seven University 
of Nebraska students. The key for scoring this scale was con- 
structed by obtaining the estimated correlation between each possi- 
ble item response and an academic achievement criterion in another 
sample of three hundred forty-one students. The criterion for the 
key construction was obtained by subtracting an expected average 
course mark (predicted from each student’s scholastic aptitude 
scores) from the average course mark which each student actually 
attained. Weighted scores were obtained for the paired statements 
and for the qualification responses of the one hundred ninety-seven 
students. 

Partial correlation coefficients were obtained between average 
course marks and the two attitude scores with L and Q scores of 
the ACE held constant. These correlations were found to be .045 
for the qualification response scores and .263 for the paired state- 
ment scores. It was concluded that the score on the qualification 
response did not contribute significantly to the prediction of course 
marks, whereas the paired statement score yielded a significant 
contribution to the prediction scheme. 


BOOK REVIEWS 


Henry Bowers. Research in the Training of Teachers. Toronto, 
Canada: J. M. Dent & Sons (Canada) Ltd., 1952, pp. 167. 
$1.90. 


From the standpoint of the lay reader’s understanding of reports 
of scientific investigations this little volume could well serve as a 
model. Since most of those concerned with teacher-training are 
not generally highly trained in statistics, this ease of reading is a 
most commendable characteristic. Nor does this imply any evidence 
that the book’s author is not thoroughly conversant with the 
statistical tools appropriate to his task. He evidently has kept al- 
ways in mind the reading audience. 

The book represents an enormous amount of patient, well-directed, 
painstaking labor. The spirit in which the book was 
written is well stated in the Preface: “Only a small part of the 
research carried out in the Stratford and Ottawa Normal Schools 
is included in the papers now published; nine-tenths of the glacial 
mass is submerged. This submergence was the result of self-criticism 
which, at least, felt ruthless, and a desire to maintain inner 
standards of procedure. These remarks are made without implication 
that tables have been handed down from Mount Sinai.” 

The contents are organized as fourteen ‘Papers,’ each of which 
reports one or more experiments to determine the importance of a 
rather exhaustive list of variables: “I. Condensation of the Academic 
Records of Ontario Secondary School Pupils Who Possess 
the Academic Requirements for Entering a Provincial Normal 
School; II. The Appearance of the Student-teacher; III. Concomitants 
of the Marks Obtained in the Term Examinations of the 
Normal School; IV. Traits of Personality Associated with the 
Degree of Success of Student-teachers in Practice-Training; V. 
Concomitants of the Marks Given for Practice-teaching; VI. The 
Bearing of Miscellaneous Activities and Interests of Student-teachers 
on Achievement in Practice-teaching; VII. Qualities of 
Sociability and Leadership Possessed by Student-teachers; VIII. 
The Relationship of Height and Weight of Student-teachers to the 
Quality of Their Practice-teaching; IX. The Aptitude Test for 
Elementary School Teachers-in-Training as an Instrument for Prediction; 
X. Growth of Student-teachers in Certain Traits of Personality 
During Their Period of Training; XI. Sex-differences in 
Ratings of Certain Traits of Personality; XII. A Comparison of 
the Achievement of Male and Female Student-teachers; XIII. 
Variation of Standards in the Marking of Lessons Taught by 
Student-teachers; XIV. The Feasibility of ‘Homogeneous’ Grouping 
in a Normal School.” 

Criterion measures were in general such as are assumed to be 
related to effectiveness in teaching. It is to be hoped that these 
well-designed, careful researches will be extended to include the 
more ultimate criterion of degree of desirable changes in pupils 
taught by teachers with differing characteristics. A readily avail- 
able criterion not investigated is obviously the attitude of pupils 
taught by student-teachers, a variable almost certainly closely 
related to pupil motivation and learning. 

A few rather minor criticisms include the following: There is no 
index—not too serious, however, since each paper is relatively short 
and self-contained with a summary and conclusions. Tables are not 
numbered in sequence throughout the book, but within each paper. 
The one graph (p. 3) rather confusingly reverses the usual Cartesian 
coördinate system on the abscissa. The reviewer detected a 
few errors, either typographical or computational. For example, 
the Spearman-Brown r on page 12 is incorrect; it should be .84, 
not .88. The first percentile value given in Table I, p. 14, must 
be in error. A page reference on page 27 is given as ‘page §§fp. 
On page 59 the two rho’s should be .74 and .73, respectively, not 
.60 and .78. Some account might well have been taken of the 
extensive—albeit not too productive—research literature on 
teacher effectiveness in the United States. 

These criticisms, however, are minor as compared to the light 
thrown on the import of student-teacher characteristics and on 
training procedures in the normal school situation in Ontario. Few 
teacher-training institutions are fortunate enough to have as chief 
administrator such a scientifically-minded and pertinacious pur- 
suer of facts and generalizations relevant to his highly important 
professional job. 

H. H. REMMERS 


Purdue University 


OTTO POLLAK ET AL. Social Science and Psychotherapy for Children. 
New York: Russell Sage Foundation, 1952, pp. 242. 


This is the result of a joint effort made possible by the Russell 
Sage Foundation and the Jewish Board of Guardians and directed 
by Otto Pollak with a group of nine collaborators. It attempts to 
give evidence, with considerable success, of “significant benefits 
derived from the integration of pertinent data and principles selected 
from the fields of sociology, cultural anthropology, and social 
and learning psychology into the psychoanalytically oriented child 
guidance practice of that agency.” (p. 7.) “This book is a report 
of exploration into the question of whether existing funds of social 
science knowledge can be adapted to psychotherapy practiced in 
a child guidance setting.” (p. 9.) It presents the therapeutic approach 
of the Jewish Board of Guardians in the belief that it will 
have interest and applications for a wide audience. 

The subjects dealt with in various chapters by different col- 
laborators include: adapting social science to child guidance prac- 
tice; the concept of ‘Family of Orientation’ in diagnosis and ther- 
apy; social interaction and therapy; extra-familial influences in 
pathogenesis; culture and culture conflicts in psychotherapy; age-sex 
rôles and psychotherapy of adolescents; the therapeutic management 
of anxiety in children; the utilization of volunteers in 
sociodynamic psychotherapy; limited treatment goals; and an 
evaluation from the psychiatric point of view. 

The results presented in this volume are those of a two-year 
venture to develop a “liaison between the behavior sciences and a 
specialty in social practice, child guidance.” (p. 7.) 

One of the special emphases is upon ‘The Concept of Family of 
Orientation.’ This means “the sum total of persons who form continuing 
members of the household in which the child grows up, 
that is, the primary group at the home.” (p. 42.) This must be 
carefully distinguished from the ‘family of procreation.’ Failure 
to make this distinction is thought to be very serious and to occur 
altogether too frequently. 

Interpersonal relations, of course, are of prime concern through- 
out the several discussions and the value of the volume lies not 
only in the general overview of these relationships, but also in 
the pertinent and concrete illustrations of the kinds of difficulties 
that arise in individual cases. These should be known and understood 
by the clinical worker but often are not known. 

Drinking milk during meals is a violation of the food ritual of 
Jewish law. The worker did not understand this. The culture of 
the worker did not permit understanding of the particular problem 
of the child. Some customs ir are not what the child has 
been taught is ‘kosher.’ Situations arise which the non-Jewish 
worker may not understand. “Camp experiences may require the 
child to eat food which is not prepared according to the orthodox 
food ritual practiced in the home.” In general, cultural differences 
between worker and patient may lead to serious misunderstandings 
and uninformed methods of therapy. 

The worker needs to understand that one may be quite normal 
and adjusted to situations that follow his education and training 
but seriously disturbed in an environment in which his habits and 
customs are neither followed nor understood. An intake worker 
recorded a “referral from a Hebrew school, apparently not being 
aware of or not appreciating the difference in ideology between a 
socialistically oriented Yiddish school and the orthodoxy and conservatism 
of a traditional Hebrew school (Cheder).” (p. 122.) 

The report explains what seems to have been a successful coöperative 
effort in the development of means and methods in psychotherapy. 
The point of view is essentially psychoanalytic and the 
concern of the authors is with “the particular therapeutic approach 
adopted by the Jewish Board of Guardians.” The concrete examples 
make it especially of interest and value to all students of psychotherapy. 

A useful and well made index (pp. 235-242) concludes the volume. 

A. S. EDWARDS 


The University of Georgia 


G. MILTON SMITH. More Power to Your Mind. New York: 
Harper and Brothers, 1952, pp. 180. 


This popular treatise on mental hygiene is concerned with the 
daily problems of life and of personal relationships. It is designed 
for those individuals who are needlessly operating at a level of 
effectiveness which is unnecessarily low due to relatively minor 
conflicts and frustrations. Self-understanding is fostered so that 
one may deal intelligently with everyday problems of personal, 
business, social and family life. A one-sided approach is avoided. 
The author believes that “no system which focuses exclusively on 
training the mind, or disciplining the body, or exalting the spirit 
is adequate to bring our performance in line with our capabilities. 
We need all three.” 

In the thirteen short chapters, discussions are concerned with 
such topics as needs of the self, how emotional needs and con- 
flicts arise and how they may be dealt with, learning for effective 
living, satisfaction derived from work, the mind-body team, adjustments 
to sex needs, and the rôle of family life in effective living. 

Although written in popular style, this book is based upon sound 
psychological information. It is written in a simple, clear and 
forceful style. The balance and sense of proportion are excellent. 
Since the discussion presupposes no technical background, the 
material will be readily comprehended by any intelligent reader. 
This book should not only have great vogue, but also should be a 
boon in promoting better adjustment through an improved understanding 
of self needs. 

MILES A. TINKER 

University of Minnesota 


MARGARET MEAD AND FRANCES COOKE MACGREGOR. Growth and 
Culture: A Photographic Study of Balinese Childhood. New 
York: G. P. Putnam’s Sons, 1951, pp. 223. 


The work of Arnold Gesell and his colleagues has made a significant 
and lasting impression on the science of developmental human 
behavior. Their findings concerning specific growth patterns 
in infant and child development, and their concept of ‘growth 
spirals,’ have influenced profoundly the thinking of present-day 
child psychologists, educators, and many parents. 

In this book, two well-known anthropologists have attempted 
to see to what extent the various growth patterns which the Gesell 
workers found in their New Haven subjects would appear in chil- 
dren living in a village of Bali (now a part of the Republic of Indo- 
nesia). They also were interested in finding out whether or not 
the Balinese culture would cause significant differences to appear 
in these children as contrasted with Americans. What they dis- 
covered was this: The general stages of developmental behavior 
are very much the same for Balinese as for American children. 
There are some differences, however, the sequence from creeping 
to walking being an outstanding one. Seen against Gesell’s spiral 
analysis of development, the Balinese children tend to neglect the 
crawling stage, and to some degree the creeping stage as well, 
but they reinforce the frogging-squatting sequence that American 
children neglect. Other developmental areas in which contrasts ap- 
pear are related primarily to the emphasis on flexibility and high 
tonus which is encouraged in Balinese children. 

Emerging from these and other comparisons of developmental 
behavior is the keynote of the entire book: maturation and learning 
interact in the production of the various developmental stages 
studied. While biology so fixes these stages that in a great many 
ways Balinese development is like that of American children, yet 
cultural differences also go to determine differences in develop- 
mental patterns. 

The reader whose background is closely related to cultural an- 
thropology will probably find to his satisfaction that this already 
widely-accepted principle of maturation-learning interaction is ably 
discussed and amply buttressed with appropriate observational 
data and interpretations. In addition, he may well find, along with 
the reader whose primary training and interest lies somewhat out- 
side of anthropology, that the most rewarding part of the book is 
the set of fifty-eight photographs of Balinese children and adults 
which illustrate the character of the various developmental stages 
and relationships described in the text. (These plates were selected 
from pictures made by Gregory Bateson between 1936 and 1939, 
when with Margaret Mead, he collected data for his later published 
work in 1942, Balinese Character.) The arrangement of the photo- 
graphic plates, the accompanying explanations for each, their rele- 
vance to the developmental tasks discussed, and the sheer photo- 
graphic fidelity with which they present what the authors are trying 
to explain—these are all excellent, and, in the reviewer’s opinion, 
are what really make the book different and new. 

These photographs drive home, indeed, the principle that culture 
can modify developmental behavior patterns. But they also show— 
and this is a virtue of the photographic approach, that it can do 
this so dramatically and clearly—how very much like their Ameri- 
can age-mates, after all, these Balinese children are. 

FRANK COSTIN 


University of Illinois 


JAMES L. MURSELL. Using Your Mind Effectively. New York: 
McGraw-Hill Book Company, 1951, pp. 254. 


In spite of the more general implication of the title and the 
author’s protestation against confining the concept of mental effec- 
tiveness to scholastic efficiency, Using Your Mind Effectively can 
be described best as a ‘how-to-study’ book written in popular style. 
The theme upheld throughout the volume is that the achievement 
of better thinking is the requisite for attaining greater adeptness 
in studying. Since the desideratum of better thinking is learnable 
and teachable, Mursell’s task is to show the way to the reader. 
The steps in reaching comprehension and understanding are for- 
mulated as: (1) obtaining an over-all view of the material under 
consideration, (2) identifying its main features, and (3) working 
the details into their proper relationship to the main skein. 

After an introductory section on the value of mental efficiency, 
the book is broken down into three main divisions: one dealing 
with an exposition of the general psychological principles to be 
observed in attaining intellectual mastery and two devoted to the 
demonstration of practical applications. In Part One the author 
elaborates on the nature and usefulness of the sequence: picking up 
the main thread, identifying essentials, and relating the details to 
the whole. He discusses the importance of this sequence in the 
reading of textbooks, study of college courses, and the writing of 
term papers and theses. The construction of a mental map into 
which essential parts and details may be fixed is offered as the 
modus operandi for achieving mental excellence. Applications of 
this technique to non-academic situations are demonstrated, e.g., 
preparing a talk, orienting oneself to a new job, arranging one’s 
budget, and learning to play tennis. 

In Part Two standard topics of the ‘how-to-study’ books such as 
budgeting time, note-taking and note-using, self-testing, and con- 
centration are discussed under the rubric of working tools and 
practical plans. The general rationale for the recommended ap- 
proaches as well as the minutiae of procedural detail is set forth. 
Much of this is the stock advice on these topics of ‘how-to-study’ 
books. The author appears at times to be self-conscious about some 
of the trivia which he presents. After marshalling a lengthy list of 
materials and equipment that should be kept in the study room, 
he feels constrained to remark that he offers this for whatever it is 
worth. In this division of the book the leitmotif is that the facilitation 
of thinking should be the governing consideration in the 
arrangement of time, conditions, and methods of work. Everything 
should be instrumental to the goal of encouraging thinking; note-taking 
and review are seen as experiences in thinking. 

In Part Three under the title “Some More Extended Applica- 
tions” appear the following topics: memorizing, reading, writing 
term papers and theses, and creative thinking. Good memorization 
is viewed as conditional upon proper purposing, understanding, and 
noticing. Analyzing, interpreting, and thinking are considered the 
crux of the development of reading skill. In writing term papers, 
it is recommended that the outline and over-all plan be completed 
before a single word of the final manuscript is written. In a very 
short chapter on creative thinking, Mursell makes the point that 
whenever one grasps the meaning of another person’s writing, one 
is engaged in creative thinking. The author makes clear that his 
aim in offering suggestions is not to facilitate the functioning of 
routine-minded workers, but to influence the intellectual worker 
to approach even mundane tasks as invitations to creative think- 
ing. He concludes his work with the statement that he has tried 
to make everything in the book center on creative thinking. 

After finishing this book, the reader might legitimately ask what 
progress psychology has made, since William James, in offering 
assistance to students in their pursuit of academic success. No 
later writer seems able to go beyond or even approach the level of 
James’ Talks to Teachers on Psychology: and to Students on Some 
of Life’s Ideals published in 1899. Basically little that is new since 
Kitson’s How To Use Your Mind appears evident in the more 
recent books on ‘how to study’. Aside from the SQ3R technique 
of Ohio State University for studying chapters in the convention- 
ally organized textbook and the ‘push on’ or quick reading method 
for mastering foreign languages, there is in the book under review 
scarcely any reference to new developments in effective study 
methods. Psychologists who choose to write ‘how-to-study’ books 
seem to feel obliged to approach their topics as exclusively cog- 
nitive-intellective matters unrelated to the emotions and the total 
personality. They appear to be oblivious to the psychoanalytic 
movement and to dynamic psychology in general. Mursell fails to 
suggest that the inability to understand and think effectively may 
be due to emotional factors as much as to poor techniques in 
cognition. The failure to integrate cognitive functions with the 
entirety of the personality accounts for the antediluvian character 
of most ‘how-to-study’ books. 

Mursell has done a good job of expounding the traditional ap- 
proach for developing mental efficiency. However, at times, the 
style is repetitious, long-winded, and too general without the com- 
pensation of being inspirational—hence boring. The standard of 
writing at times seems to dip below the college level to which it 
is apparently directed. The attempt to offer applications to non- 
academic pursuits appears to be simply a gesture. For those who 
want a simple and very clear exposition of the conventional views 
on effective use of the mind, this book may be most satisfying. 

Philip M. Kitay 

Adelphi College 


C. W. Odell. How to Improve Classroom Testing. Dubuque, Iowa: 
William C. Brown Company, 1953, pp. 156. 


Professor Odell has prepared a student manual that emphasizes 
the practical and non-technical aspects of the construction and 
administration of informal or teacher-made tests of achievement. 
Students of education will be pleased to find that he sets the prob- 
lem of measuring achievement in the context of curriculum de- 
velopment by taking the objectives of education as the definitions 
of the achievements that are desired. He then concentrates on 
testing programs and types of tests, excluding problems of intel- 
ligence and personality measurement. Chapters IV through XII 
are devoted to test construction principles and illustrations of 
various types of test items or problems; these illustrations are 
drawn from a number of subject matter fields. The manual appears 
to be a good source of ideas for test item types, and as such it 
should be of value to teachers. One chapter on administration and 
scoring and one on statistical methods in connection with testing 
complete the volume. The chief contribution of the manual is its 
rather extensive and practical advice on how to develop and use 
informal tests. 

Chester W. Harris 

The University of Wisconsin 